erl_tar

Module

erl_tar

Module summary

Unix 'tar' utility for reading and writing tar archives.

Description

This module archives and extracts files to and from a tar file. It supports the ustar format (IEEE Std 1003.1 and ISO/IEC 9945-1). All modern tar programs (including GNU tar) can read this format. To ensure that GNU tar produces a tar file that erl_tar can read, specify option --format=ustar to GNU tar.

By convention, the name of a tar file is to end in ".tar". To abide by the convention, add ".tar" to the name.

Tar files can be created in one operation using function create/2 or create/3.

Alternatively, for more control, use functions open/2, add/3,4, and close/1.

To extract all files from a tar file, use function extract/1. To extract only some files or to be able to specify some more options, use function extract/2.

To return a list of the files in a tar file, use function table/1 or table/2. To print a list of files to the Erlang shell, use function t/1 or tt/1.

To convert an error term returned from one of the functions above to a readable message, use function format_error/1.
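The functions named above can be combined into a short round trip. The following is a minimal sketch, not a definitive usage pattern: the module name tar_roundtrip and the file names are hypothetical, and the archive is built from in-memory binaries so that the example does not depend on existing files.

```erlang
-module(tar_roundtrip).
-export([run/0]).

%% Creates a small archive, lists its contents, and extracts it
%% back into memory. All names below are made up for illustration.
run() ->
    Tar = "roundtrip_demo.tar",
    Files = [{"hello.txt", <<"hello">>},
             {"dir/world.txt", <<"world">>}],
    %% create/2 writes the whole archive in one operation.
    ok = erl_tar:create(Tar, Files),
    %% table/1 returns the names stored in the archive.
    {ok, Names} = erl_tar:table(Tar),
    %% extract/2 with option memory returns {Name, Binary} tuples
    %% instead of writing files to disk.
    {ok, Extracted} = erl_tar:extract(Tar, [memory]),
    _ = file:delete(Tar),
    {Names, Extracted}.
```

Calling tar_roundtrip:run() returns the list of names together with the extracted {Name, Binary} pairs.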

Unicode Support

If file:native_name_encoding/0 returns utf8, path names are encoded in UTF-8 when creating tar files, and path names are assumed to be encoded in UTF-8 when extracting tar files.

If file:native_name_encoding/0 returns latin1, no translation of path names is done.
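To see which of the two cases applies on a given system, the encoding can be queried directly. This is a small sketch; the module name tar_encoding and the returned description strings are illustrative only.

```erlang
-module(tar_encoding).
-export([describe/0]).

%% Reports how erl_tar treats path names on this system, based on
%% the value returned by file:native_name_encoding/0.
describe() ->
    case file:native_name_encoding() of
        utf8   -> "path names are encoded in UTF-8";
        latin1 -> "path names are not translated"
    end.
```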

Other Storage Media

The erl_tar module normally accesses the tar file on disk using the file module. When other needs arise, you can define your own low-level Erlang functions to perform the writing and reading on the storage media; use function init/3.

An example of this is the SFTP support in ssh_sftp:open_tar/3. This function opens a tar file on a remote machine using an SFTP channel.

Limitations

  • For maximum compatibility, it is safe to archive files with names up to 100 characters in length. Such tar files can generally be extracted by any tar program.

  • For filenames exceeding 100 characters in length, the resulting tar file can only be correctly extracted by a POSIX-compatible tar program (such as Solaris tar or a modern GNU tar).

  • Files with names longer than 256 bytes cannot be stored.

  • The file name that a symbolic link points to is always limited to 100 characters.

Exports

add(TarDescriptor, Filename, Options) -> RetValue

Types:

TarDescriptor = term()
Filename = filename()
Options = [Option]
Option = dereference|verbose|{chunks,ChunkSize}
ChunkSize = positive_integer()
RetValue = ok|{error,{Filename,Reason}}
Reason = term()

Adds a file to a tar file that has been opened for writing by open/2.

Options:

dereference

By default, symbolic links are stored as symbolic links in the tar file. To override the default and store the file that the symbolic link points to into the tar file, use option dereference.

verbose

Prints an informational message about the added file.

{chunks,ChunkSize}

Reads data in parts from the file. This is intended for memory-limited machines that, for example, build a tar file on a remote machine over SFTP; see ssh_sftp:open_tar/3.

add(TarDescriptor, FilenameOrBin, NameInArchive, Options) -> RetValue

Types:

TarDescriptor = term()
FilenameOrBin = filename()|binary()
Filename = filename()
NameInArchive = filename()
Options = [Option]
Option = dereference|verbose
RetValue = ok|{error,{Filename,Reason}}
Reason = term()

Adds a file to a tar file that has been opened for writing by open/2. This function accepts the same options as add/3.

NameInArchive is the name under which the file becomes stored in the tar file. The file gets this name when it is extracted from the tar file.

close(TarDescriptor)

Types:

TarDescriptor = term()

Closes a tar file opened by open/2.
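The open/2, add/3, add/4, and close/1 functions are typically used together. The following sketch assumes hypothetical file names and a hypothetical module name (tar_incremental); it writes a small input file first so that add/3 has something to read from disk.

```erlang
-module(tar_incremental).
-export([build/0]).

%% Builds an archive step by step and returns its contents as
%% extracted into memory. All file names are made up.
build() ->
    Tar = "incremental_demo.tar",
    Input = "incremental_demo_input.txt",
    ok = file:write_file(Input, <<"read from disk in chunks">>),
    {ok, TarDesc} = erl_tar:open(Tar, [write]),
    %% add/3: archive a file from disk, reading 1024 bytes at a time.
    ok = erl_tar:add(TarDesc, Input, [{chunks, 1024}]),
    %% add/4: archive a binary under an explicit name in the archive.
    ok = erl_tar:add(TarDesc, <<"from memory">>, "notes/memo.txt", []),
    %% The return value of close/1 is not specified above, so it is
    %% deliberately not matched here.
    _ = erl_tar:close(TarDesc),
    Result = erl_tar:extract(Tar, [memory]),
    _ = file:delete(Input),
    _ = file:delete(Tar),
    Result.
```

Nothing should be read from the archive until close/1 has been called, since the file may not be complete before that point.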

create(Name, FileList) -> RetValue

Types:

Name = filename()
FileList = [Filename|{NameInArchive, binary()}|{NameInArchive, Filename}]
Filename = filename()
NameInArchive = filename()
RetValue = ok|{error,{Name,Reason}}
Reason = term()

Creates a tar file and archives the files whose names are specified in FileList into it. The files can either be read from disk or be specified as binaries.

create(Name, FileList, OptionList) -> RetValue

Types:

Name = filename()
FileList = [Filename|{NameInArchive, binary()}|{NameInArchive, Filename}]
Filename = filename()
NameInArchive = filename()
OptionList = [Option]
Option = compressed|cooked|dereference|verbose
RetValue = ok|{error,{Name,Reason}}
Reason = term()

Creates a tar file and archives the files whose names are specified in FileList into it. The files can either be read from disk or be specified as binaries.

The options in OptionList modify the defaults as follows:

compressed

The entire tar file is compressed, as if it had been run through the gzip program. To abide by the convention that the name of a compressed tar file is to end in ".tar.gz" or ".tgz", add the appropriate extension.

cooked

By default, function open/2 opens the tar file in raw mode, which is faster but does not allow a remote (Erlang) file server to be used. Adding cooked to the mode list overrides the default and opens the tar file without option raw.

dereference

By default, symbolic links are stored as symbolic links in the tar file. To override the default and store the file that the symbolic link points to into the tar file, use option dereference.

verbose

Prints an informational message about each added file.
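As an illustration of option compressed, the sketch below creates a gzip-compressed archive and reads it back. The module name tar_compressed and the file names are hypothetical. The check on the first two bytes relies on the standard gzip magic number (16#1f, 16#8b).

```erlang
-module(tar_compressed).
-export([make/0]).

%% Creates a compressed archive from a binary, verifies that the
%% file on disk starts with the gzip magic number, and extracts it
%% back into memory.
make() ->
    Tar = "compressed_demo.tar.gz",
    Files = [{"readme.txt", <<"compressed contents">>}],
    ok = erl_tar:create(Tar, Files, [compressed]),
    %% A gzip stream always begins with the bytes 16#1f 16#8b.
    {ok, <<16#1f, 16#8b, _/binary>>} = file:read_file(Tar),
    %% The same option is given when extracting.
    Result = erl_tar:extract(Tar, [compressed, memory]),
    _ = file:delete(Tar),
    Result.
```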

extract(Name) -> RetValue

Types:

Name = filename() | {binary,Binary} | {file,Fd}
Binary = binary()
Fd = file_descriptor()
RetValue = ok|{error,{Name,Reason}}
Reason = term()

Extracts all files from a tar archive.

If argument Name is specified as {binary,Binary}, the contents of the binary are assumed to be a tar archive.

If argument Name is specified as {file,Fd}, Fd is assumed to be a file descriptor returned from function file:open/2.

Otherwise, Name is to be a filename.

extract(Name, OptionList) -> RetValue

Types:

Name = filename() | {binary,Binary} | {file,Fd}
Binary = binary()
Fd = file_descriptor()
OptionList = [Option]
Option = {cwd,Cwd}|{files,FileList}|compressed|cooked|memory|keep_old_files|verbose
Cwd = [dirname()]
FileList = [filename()]
RetValue = ok|MemoryRetValue|{error,{Name,Reason}}
MemoryRetValue = {ok, [{NameInArchive,binary()}]}
NameInArchive = filename()
Reason = term()

Extracts files from a tar archive.

If argument Name is specified as {binary,Binary}, the contents of the binary are assumed to be a tar archive.

If argument Name is specified as {file,Fd}, Fd is assumed to be a file descriptor returned from function file:open/2.

Otherwise, Name is to be a filename.

The following options modify the defaults for the extraction:

{cwd,Cwd}

Files with relative filenames are by default extracted to the current working directory. With this option, files are instead extracted into directory Cwd.

{files,FileList}

By default, all files are extracted from the tar file. With this option, only those files are extracted whose names are included in FileList.

compressed

With this option, the file is uncompressed while extracting. If the tar file is not compressed, this option is ignored.

cooked

By default, function open/2 opens the tar file in raw mode, which is faster but does not allow a remote (Erlang) file server to be used. Adding cooked to the mode list overrides the default and opens the tar file without option raw.

memory

Instead of extracting to a directory, this option gives the result as a list of tuples {Filename, Binary}, where Binary is a binary containing the extracted data of the file named Filename in the tar file.

keep_old_files

By default, all existing files with the same name as files in the tar file are overwritten. With this option, existing files are not overwritten.

verbose

Prints an informational message for each extracted file.
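The following sketch combines several of the options above. It is an illustration under stated assumptions: the module name tar_extract_demo, the archive name, and the output directory are hypothetical, and the target directory is created up front rather than relying on extract/2 to create it.

```erlang
-module(tar_extract_demo).
-export([run/0]).

%% Demonstrates {files,FileList}, memory, {cwd,Cwd}, and
%% keep_old_files on a small archive built from binaries.
run() ->
    Tar = "extract_demo.tar",
    Dir = "extract_demo_out",
    ok = erl_tar:create(Tar, [{"a.txt", <<"A">>}, {"b.txt", <<"B">>}]),
    %% Extract only a.txt, and return it in memory.
    {ok, Selected} = erl_tar:extract(Tar, [{files, ["a.txt"]}, memory]),
    %% Extract everything into Dir (ensure_dir creates Dir itself).
    ok = filelib:ensure_dir(filename:join(Dir, "placeholder")),
    ok = erl_tar:extract(Tar, [{cwd, Dir}]),
    %% Change an extracted file, then extract again without
    %% overwriting existing files.
    APath = filename:join(Dir, "a.txt"),
    ok = file:write_file(APath, <<"edited">>),
    ok = erl_tar:extract(Tar, [{cwd, Dir}, keep_old_files]),
    {ok, Kept} = file:read_file(APath),
    %% Clean up the files created by this example.
    _ = [file:delete(filename:join(Dir, F)) || F <- ["a.txt", "b.txt"]],
    _ = file:del_dir(Dir),
    _ = file:delete(Tar),
    {Selected, Kept}.
```

The second element of the returned tuple shows whether keep_old_files preserved the edited contents of a.txt.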

format_error(Reason) -> string()

Types:

Reason = term()

Converts an error reason term to a human-readable error message string.
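A typical use is to turn the error term of a failed call into a message. In this sketch the module name tar_errors is hypothetical, and the term passed to format_error/1 is the second element of the {error,{Name,Reason}} tuple returned by table/1.

```erlang
-module(tar_errors).
-export([describe/1]).

%% Lists the archive Tar, or returns a readable message when the
%% archive cannot be read.
describe(Tar) ->
    case erl_tar:table(Tar) of
        {ok, Names} ->
            {ok, Names};
        {error, Reason} ->
            %% Flatten in case the result is a deep character list.
            {error, lists:flatten(erl_tar:format_error(Reason))}
    end.
```

For example, tar_errors:describe/1 called with the name of a file that does not exist returns {error, Message}, where Message is a string describing the failure.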

init(UserPrivate, AccessMode, Fun) -> {ok,TarDescriptor} | {error,Reason}

Types:

UserPrivate = term()
AccessMode = [write] | [read]
Fun when AccessMode is [write] = fun(write, {UserPrivate,DataToWrite})->...; (position,{UserPrivate,Position})->...; (close, UserPrivate)->... end
Fun when AccessMode is [read] = fun(read2, {UserPrivate,Size})->...; (position,{UserPrivate,Position})->...; (close, UserPrivate)->... end
TarDescriptor = term()
Reason = term()

Fun defines what to do when the higher-level tar handling functions (such as add/3, add/4, and close/1) need to perform a storage operation.

Fun is called when a tar function wants to do a low-level operation, such as writing a block to a file. It is called as Fun(Op, {UserPrivate,Parameters...}), where Op is the operation name, UserPrivate is the term passed as the first argument to init/3, and Parameters... are the data added by the tar function to be passed down to the storage handling function.

Parameter UserPrivate is typically the result of opening a low-level structure like a file descriptor or an SFTP channel id. The different Fun clauses operate on that very term.

The parameter lists of the fun clauses are as follows:

(write, {UserPrivate,DataToWrite})

Writes term DataToWrite using UserPrivate.

(close, UserPrivate)

Closes the access.

(read2, {UserPrivate,Size})

Reads Size bytes using UserPrivate. Notice that there is only an arity-2 read function, not an arity-1 function.

(position,{UserPrivate,Position})

Sets the position of UserPrivate as defined for files in file:position/2.

Example:

The following is a complete Fun parameter for reading and writing on files using the file module:

ExampleFun = 
   fun(write, {Fd,Data}) ->  file:write(Fd, Data);
      (position, {Fd,Pos}) -> file:position(Fd, Pos);
      (read2, {Fd,Size}) -> file:read(Fd, Size);
      (close, Fd) -> file:close(Fd)
   end

Here Fd was specified to function init/3 as:

{ok,Fd} = file:open(Name, ...),
{ok,TarDesc} = erl_tar:init(Fd, [write], ExampleFun),

TarDesc is then used:

erl_tar:add(TarDesc, SomeValueIwantToAdd, FileNameInTarFile, []),
...,
erl_tar:close(TarDesc)

When the erl_tar core wants to, for example, write a piece of Data, it would call ExampleFun(write, {UserPrivate,Data}).

Note

You do not need to use this file module example directly, because this is in principle what function open/2 does.

Warning

The TarDescriptor term is not a file descriptor. You are advised not to rely on the specific contents of this term, as it can change in future Erlang/OTP releases when more features are added to this module.

open(Name, OpenModeList) -> RetValue

Types:

Name = filename()
OpenModeList = [OpenMode]
OpenMode = write|compressed|cooked
RetValue = {ok,TarDescriptor}|{error,{Name,Reason}}
TarDescriptor = term()
Reason = term()

Creates a tar file for writing (any existing file with the same name is truncated).

By convention, the name of a tar file is to end in ".tar". To abide by the convention, add ".tar" to the name.

In addition to the write atom, the following atoms can be added to OpenModeList:

compressed

The entire tar file is compressed, as if it had been run through the gzip program. To abide by the convention that the name of a compressed tar file is to end in ".tar.gz" or ".tgz", add the appropriate extension.

cooked

By default, the tar file is opened in raw mode, which is faster but does not allow a remote (Erlang) file server to be used. Adding cooked to the mode list overrides the default and opens the tar file without option raw.

To add one file at a time to an opened tar file, use function add/3,4. When you are finished adding files, use function close/1 to close the tar file.

Warning

The TarDescriptor term is not a file descriptor. You are advised not to rely on the specific contents of this term, as it can change in future Erlang/OTP releases when more features are added to this module.

table(Name) -> RetValue

Types:

Name = filename()
RetValue = {ok,[string()]}|{error,{Name,Reason}}
Reason = term()

Retrieves the names of all files in the tar file Name.

table(Name, Options)

Types:

Name = filename()

Retrieves the names of all files in the tar file Name.

t(Name)

Types:

Name = filename()

Prints the names of all files in the tar file Name to the Erlang shell (similar to "tar t").

tt(Name)

Types:

Name = filename()

Prints names and information about all files in the tar file Name to the Erlang shell (similar to "tar tv").
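The two printing functions can be tried on a throwaway archive. In this sketch the module name tar_listing and the file names are hypothetical, and the return values of t/1 and tt/1 are ignored because they are not specified above.

```erlang
-module(tar_listing).
-export([show/0]).

%% Creates a small archive and prints its contents in both the
%% short (t/1) and the long (tt/1) form.
show() ->
    Tar = "listing_demo.tar",
    ok = erl_tar:create(Tar, [{"one.txt", <<"1">>}, {"two.txt", <<"22">>}]),
    io:format("t/1 output:~n"),
    _ = erl_tar:t(Tar),
    io:format("tt/1 output:~n"),
    _ = erl_tar:tt(Tar),
    %% Remove the archive again; file:delete/1 returns ok on success.
    file:delete(Tar).
```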

© 2010–2017 Ericsson AB
Licensed under the Apache License, Version 2.0.