Talk:Tar (file format)


This is the talk page for discussing improvements to the Tar (file format) article.


Error in file size limits ?

The article states "12 bytes reserved for storing the file size, only 11 octal digits can be stored. This gives a maximum file size of 8 gigabytes on archived files." This seems to be in error: 11 decimal digits is 99,999,999,999 bytes. 99,999,999,999 / 1024 ≈ 97,656,249 KBytes. 97,656,249 / 1000 ≈ 97,656 MegaBytes. 97,656 / 1000 ≈ 97 GigaBytes. --Preceding unsigned comment added by 194.66.238.27 (talk) 13:27, 13 March 2008 (UTC)

Note that it says octal digits, not decimal digits. The largest value that 11 octal digits can hold is 8^11 - 1 = 2^33 - 1 bytes ≈ 8 GB. MHenoch (talk) 15:54, 23 April 2008 (UTC)
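To make the octal arithmetic above concrete, here is a small Python sketch. It assumes, as the article's wording suggests, that one of the 12 bytes in the size field is taken by a terminator, leaving 11 octal digits; the constant names are made up for illustration.

```python
# Illustrative arithmetic only; field widths are taken from the quoted article text.
SIZE_FIELD_BYTES = 12                  # bytes reserved for the size field
OCTAL_DIGITS = SIZE_FIELD_BYTES - 1    # assumption: one byte is a terminator

# Largest number that fits in 11 octal digits (all digits are 7).
max_size = 8 ** OCTAL_DIGITS - 1

# Three equivalent ways of writing the same limit.
assert max_size == int("7" * OCTAL_DIGITS, 8)
assert max_size == 2 ** 33 - 1

print(max_size)            # 8589934591 bytes
print(max_size / 2 ** 30)  # just under 8 (GiB)
```

The decimal reading (99,999,999,999) would apply only if the field stored decimal digits, which it does not.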

Format   UID        File Size  File Name  Devn
gnu      1.8e19     Unlimited  Unlimited  63
oldgnu   1.8e19     Unlimited  Unlimited  63
v7       2097151    8GB        99         n/a
ustar    2097151    8GB        256        21
posix    Unlimited  Unlimited  Unlimited  Unlimited

http://www.gnu.org/software/automake/manual/tar/Formats.html 194.66.238.27 (talk) 19:29, 18 March 2008 (UTC)

No information about the file format

Some irony that an article called Tar file format manages to say exactly nothing about the Tar file format. --Tagishsimon (talk)

Now it does! (Although I am tired, so I will write about the ustar extensions another day, unless anyone else wants to.) Sam Jervis 23:18, 6 Jan 2005 (UTC)
IMO, the article should be split into "tar (file format)" and "tar (file archiver)" --Minghong 12:41, 23 Mar 2005 (UTC)

I would suggest "tar (utility)" to describe the command-line utility and "tar (file format)" to describe the format. In particular, there are many programs that read and write tar formats that are not the tar utility. This includes various GUI archiving tools, for example. It also includes other tools that are not general-purpose archivers, such as the FreeBSD package tools, which use tar format. (I believe RPM is based on cpio format and Deb uses ar format, but I can't remember exactly.) It's also worth noting that many command-line tar programs can read and write archive formats other than the tar format. Tim Kientzle 00:17, 2 September 2007 (UTC)

Tar vs Ar

I never knew about ar before I came to Wikipedia. But now that I do, why is tar used instead of ar? :)--Chealer 22:58, 2005 Apr 1 (UTC)

Because traditionally tar was used for backups to tape and ar was used for static libraries. So ar's format is highly tied to ld and static libraries. Furthermore, ar archives are not necessarily cross-platform due to endianness, and the tar format was standardized through POSIX.--b4hand 16:41, 30 Apr 2005 (UTC)

Conformance to quality standards of Wikipedia

This article is somewhat of a tutorial for a user of a specific tool on a specific operating system. Moreover, words like "Whoops" and the examples make this article both lengthy and below par. Please consider conformance, or put a {{cleanup|January 2006}} tag to notify people who use this as an encyclopedia and not as a man-page. --Preceding unsigned comment added by Dormant25 (talk • contribs) 06:53, 3 January 2006

I've removed the cleanup tag as there are no examples of colloquialism anymore and AFAICC it is perfectly acceptable to include in an article about a file format how to create and view files in that format. I'm sure most readers want to know how to actually use it to get stuff done, not the other more esoteric details like the technical details of the format and its history. Joe Llywelyn Griffith Blakesley talk contrib 19:10, 6 August 2006 (UTC)
I disagree with Mr. Blakesley. The user who wanted to get straight to using the format would go to the manpage, reference book, tutorial book, operating system documentation, or whatever. The person who is interested in how a tar file is actually laid out, what the advantages/disadvantages of the layout are, etc., without researching it herself, would be the one to turn to Wikipedia. Wikipedia would be the natural place to turn, and she is the user we should be focusing on. Now, it is appropriate to have some discussion of what tools people use to manage these archives, and what tools can read or write them, but for a detailed description of how the tools are used (and a more encyclopedic description of how the tools are implemented/how they came to be as they are), I would suggest putting that on a separate page (i.e. tar (file format), tar (unix command), and eventually maybe BSD tar, and GNU tar...). I note that there is already a mistake about BSD tar in the article. Jimmy hartzell 17:27, 15 February 2007 (UTC)

Compression

The vast majority of instances of the "tar" program that one is likely to encounter now include the "z" (compress) option. Thus the discussion about why this is a bad idea is kind of silly. Tim Bray 05:22, 1 February 2006 (UTC)

tarball

Does anybody know where the word tarball comes from? --Lionel H. Grillet 12:36, 13 March 2006 (UTC)

Probably from the word... "Tarball", which is similar to a tarbaby. --maru (talk) contribs 14:01, 13 March 2006 (UTC)
Has anyone else noticed the word tarball in the song Bubble Toes by Jack Johnson? --Preceding unsigned comment added by Samineru (talk • contribs) 00:07, April 2, 2006 (UTC)
I was under the impression that tarballs were .tgz files, not .tar files. 71.123.19.163 05:42, 30 June 2006 (UTC)
The term TARBALL is slang, a sort of self-deprecating joke that requires the following background:
1) In the early days, tar was used to archive programs and program source because end users and their data files were typically stored in other formats (or not at all -- we were programmers and one thing we did NOT care about was end users)
2) One of the worst comments you could make about a program design was that it was a MUDBALL -- meaning it had no structure at all, a total dirty mess from start to finish, etc.
3) Ergo your archived source code became known as a tarball
I was there.
It's a shame to let the name tarball become slang for any tar archive. --Preceding unsigned comment added by 63.197.50.90 (talk • contribs) 16:25, 3 August 2006

Konqueror web archives

I've added an additional file extension, .war, which is used for Konqueror web archives. Due to the possibility of confusion with Java archives (also .war), I think this needs a citation.

There are throw-away comments on the web, especially on KDE discussion lists, that ".war files are renamed .tar.gz files", and I believe that they are reliable, because I have played with .war files and satisfied myself that they actually are tar files compressed with gzip. E.g. tar -xzf archive.war will extract the files from one. The problem is that I can't find an authoritative source to cite. Maybe the KDE source code?

Also, there are incorrect claims on some mailing lists that .war files are "zip files". I've also seen posts on KDE mailing lists suggesting that Konq should/will use .wtz instead of .war, but I haven't seen any evidence that this was ever more than a proposal.

Assistance in finding an authoritative source to cite will be appreciated. Limeguin 15:21, 24 August 2006 (UTC)
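One way to check the claim for a particular file, without relying on mailing-list posts, is to test whether it opens as a gzip-compressed tar archive. The sketch below uses Python's standard tarfile module. The helper names and the in-memory sample archive are hypothetical stand-ins, since no real Konqueror .war file is available here, and a passing check says something only about the file tested, not about every .war file.

```python
import io
import tarfile


def looks_like_gzipped_tar(data: bytes) -> bool:
    """Return True if the bytes can be read as a gzip-compressed tar archive."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
            archive.getnames()  # force the member headers to be read
        return True
    except (tarfile.TarError, OSError, EOFError):
        return False


def make_sample_archive() -> bytes:
    """Build a small .tar.gz in memory (stand-in for a real web archive)."""
    payload = b"<html></html>"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        info = tarfile.TarInfo(name="index.html")  # hypothetical member name
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


sample = make_sample_archive()
print(looks_like_gzipped_tar(sample))
print(looks_like_gzipped_tar(b"PK\x03\x04 zip-like bytes, not a tar"))
```

To test a real file, read it with open(path, "rb").read() and pass the bytes to looks_like_gzipped_tar. This still would not be a citable source, only a sanity check.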

BSD tar

If my understanding is correct, BSD tar does not need an external program to handle gzip/gunzip, but rather uses the libarchive library it is a wrapper for. If this is in fact the case, the article should be corrected, but I cannot find any documentation that explicitly states that no external program is called. Jimmy hartzell 17:29, 15 February 2007 (UTC)

As the author of libarchive, I can confirm that it uses the libz and libbz2 libraries to implement gzip and bzip2 compression and decompression internally. My bsdtar program (not "BSD tar", by the way) does use libarchive. Kientzle 06:06, 1 September 2007 (UTC)

Additionally, BSD tar does not really exist. The major BSDs all have their own versions of tar. The OpenBSD version, for example, does not support -j --David Chisnall 17:31, 14 August 2007 (UTC)

There are at least two different programs that go by the name "bsdtar": One is a port of an old BSD Unix implementation to MSDOS that was distributed by O'Reilly under the name "bsdtar" as part of a CDROM accompanying one of their books. The other is my own from-scratch tar implementation that I named "bsdtar" because it's released under a BSD license, in contrast to "GNU tar." My "bsdtar" is currently the default system tar for FreeBSD and at least one Linux distro that I'm aware of. Kientzle 06:06, 1 September 2007 (UTC)

To the person who removed commands section

I thought this section was really useful and think it should be put back. —The preceding unsigned comment was added by 164.165.217.254 (talk) 23:36, 30 March 2007 (UTC).

Heaven forbid that wikipedia ever contains any useful information ever. Only mindless (but "Encyclopedic"!) trivia will be allowed. Grr. Freeking... grk... delete-happy... grumble...
Yeah, useful to all those unix gurus who go to wikipedia before man for usage arguments. All zero of them. Chris Cunningham 15:24, 4 April 2007 (UTC)
1) Man pages are significantly less readable. 2) What about the windows gurus who find themselves stuck in unix for whatever reason? -- See, this is the problem with WP - no one considers even slightly different use-cases. Case in point: various articles on games - deleted because they're "guides for gamers and we don't do that lol" - no one considered (the only slightly out-of-the-box idea) of game designers/developers doing legitimate research (gack you got me started - don't get me started!). —The preceding unsigned comment was added by 60.240.227.227 (talk) 15:47, 4 April 2007 (UTC).
Just get used to it: Wikipedia is an encyclopedia; for other kinds of content, there are (tens of?) thousands of wikis out there, which you are equally welcome to edit, or even migrate deleted Wikipedia content to, as Wikipedia is licensed under the GFDL. -- intgr 16:11, 4 April 2007 (UTC)
"intgr" is right. Wikipedia's official policy as stated at WP:NOT#IINFO makes it very clear that Wikipedia articles are not instruction manuals or textbooks. Manpages may be less readable but they're certainly more reliable! Rwxrwxrwx 14:00, 5 April 2007 (UTC)
It might be useful to have command information, but it doesn't belong in an article on the file format. --Brouhaha 18:42, 21 September 2007 (UTC)

Tgz redirects here, but is not mentioned

I've noticed that Tgz is a redirect to this article, but there is no mention of those three letters anywhere in the article. That's not good. I don't know much about tar, but this should be fixed by someone. -- 199.60.2.105 20:15, 12 June 2007 (UTC)

Explanation added --tcsetattr (talk / contribs) 03:03, 5 September 2007 (UTC)

How the end of an archive is formatted

The sentence "The end of an archive is marked by at least two consecutive zero-filled blocks." under "Format details" is correct, but gives an incomplete picture of the total size of a tar file.

Per GNU info pages, in addition to having two full blocks of zeros at the end of it, an archive is padded with more zeros as required to make its size a multiple of a "record". A record is a group of blocks, typically 20, which get written to tape in one shot with no spaces between them. The number of blocks can be changed by using the -b option.

The result is that by default, the smallest archive is quite big (10240 bytes), which is interesting and surprising. Daniel Romaniuk 03:07, 21 June 2007 (UTC)

pardon my formatting, but:
bash-3.00$ touch blah
bash-3.00$ tar cvf blah.tar blah
a blah 0K
bash-3.00$ ls -liah | grep blah
660304 -rw-------   1 cc199700 staff          0 Jun 21 09:44 blah
661202 -rw-------   1 cc199700 staff       1.5K Jun 21 09:44 blah.tar
This is Solaris tar, admittedly, but 1.5k isn't 10k. Chris Cunningham 08:49, 21 June 2007 (UTC)
Good to know. According to Solaris documentation, you also can specify a blocking factor (or number of blocks per record) using the -b option. The difference is that yours defaults to 1. Perhaps we could add the explanation about record size, without mentioning what the default might be... Daniel Romaniuk 13:56, 21 June 2007 (UTC)
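The record padding discussed above can be sketched in Python. The helper padded_size is a hypothetical name that simply restates the rule from the GNU info pages (payload blocks, plus two zero blocks, rounded up to a whole record). The cross-check relies on Python's tarfile module, which pads to 20-block records when writing; it says nothing about what GNU tar or Solaris tar do on a given system.

```python
import io
import tarfile

BLOCK = 512  # bytes per tar block


def padded_size(payload_blocks: int, blocking_factor: int = 20) -> int:
    """Archive size in bytes: payload + 2 zero blocks, rounded up to a full record."""
    record = BLOCK * blocking_factor
    raw = (payload_blocks + 2) * BLOCK
    return -(-raw // record) * record  # ceiling division


# One empty file needs a single header block and no data blocks.
print(padded_size(1))                     # blocking factor 20
print(padded_size(1, blocking_factor=1))  # blocking factor 1

# Cross-check against an archive written by Python's tarfile module.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as archive:
    archive.addfile(tarfile.TarInfo(name="blah"))  # zero-length member
data = buf.getvalue()

assert len(data) == padded_size(1)
assert data[BLOCK:] == bytes(len(data) - BLOCK)  # all zeros after the header
```

With a blocking factor of 20 the helper gives 10240 bytes, and with a blocking factor of 1 it gives 1536 bytes, which is consistent with the 1.5K Solaris listing.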

tape drive limitations

The article claims that early tape drives only supported 512-byte blocks. This is clearly incorrect for the vast majority of early tape drives, and in fact tar usually wrote 10240-byte tape records. The tape drive and its formatter (controller) had no idea that the 10240-byte record was divided into any smaller unit (block) by the software.

There is probably some good reason that the format was originally designed to use 512-byte blocks, and it might even be due to a limitation of some particular tape drive, but the blanket claim is false and needs to be corrected or removed. I added a 'fact' tag for now, but will remove the claim if support or clarification is not forthcoming. --Brouhaha 18:47, 21 September 2007 (UTC)

How do you download this file?

Can anybody tell me how to download this file? --WikiCats (talk) 13:47, 22 May 2008 (UTC)