Talk:Terabyte


2TB drive available: http://www.lacie.com/products/product.htm?pid=10351 I have updated the text to mention this, but I do not know citation protocol, so I placed a link to the product here. 195.195.5.253 10:10, 22 May 2007 (UTC)

Discussion about centralization took place at Talk:Binary prefix.

Why is this a separate article??? --Noldoaran (Talk) 03:29, Feb 14, 2004 (UTC)


Let's Not Confuse "Binary Prefix" with "SI Prefix"

I just spent the last hour or so repairing inconsistencies in the article in referencing the SI prefix and binary prefix, particularly in the "Quantities of Bytes" template. So sorry you had to work so hard, but thanks for the info.

The SI prefix refers to the modern-day metric system, in which one kilo-<unit> is equal to 1,000 <units>, one mega-<unit> is equal to 1,000 kilo-<units>, etc.

The binary prefix is similar to the metric system which uses "kilo" to denote a thousand, "mega" to denote a million, etc. However, the binary prefix (which is the correct way to denote the number of bits and bytes despite the fact that it is commonly misused and the subject of recent legal disputes) is based on a 2^n premise rather than 10^n; i.e. a "kilobyte" is 1024 bytes, not 1000; a "megabyte" is 1024 kilobytes, etc.

IEC 60027-2 recently attempted to settle this dispute by declaring the commonly used misuse to now be accurate because most people and businesses use it anyway. However, the authority of the International Electrotechnical Commission to unilaterally make this decision in defiance of long-held standards is largely disputed. --Kris Craig (67.183.207.37 06:11, 12 June 2006 (UTC))

Please read through binary prefix. I think you're misunderstanding something. The "commonly used misuse" is when kilo- = 1024. The IEC certainly did not declare it to be accurate. They said it was wrong and made a new prefix for such usage. The "correct way" to denote bits and bytes is with either a decimal SI prefix (100 kilobytes = 100,000 bytes) or a binary prefix (100 kibibytes = 102,400 bytes). One or the other is probably better, depending on the thing being measured. For instance, memory is always a power of two, so it lends itself to binary prefixes. Other things (hard drive sizes, data rates) don't have an inherent base, and are better measured with SI prefixes.
Also, this discussion should be on Template talk:Quantities of bytes; not here. — Omegatron 12:43, 18 June 2006 (UTC)
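
The two conventions described in the comment above can be checked with a few lines of arithmetic. This is only a minimal Python sketch: the names `SI_KILO`, `IEC_KIBI` and `to_bytes` are made up for illustration and do not come from any standard or library.

```python
# Multipliers for the first prefix step in each convention.
SI_KILO = 10**3    # decimal (SI) kilo-  = 1,000
IEC_KIBI = 2**10   # binary (IEC) kibi-  = 1,024


def to_bytes(quantity, multiplier):
    """Convert a prefixed quantity to a plain byte count."""
    return quantity * multiplier


print(to_bytes(100, SI_KILO))   # 100 kilobytes in bytes
print(to_bytes(100, IEC_KIBI))  # 100 kibibytes in bytes
```

The two printed values are the 100,000 and 102,400 byte figures quoted in the comment.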

The problem is, the IEC is not the sole authority on the matter, as I mentioned in my changes. The use of kilo- = 1024 was set by computer scientists and manufacturers long before they decided it was "wrong". And even then, it was primarily in response to lobbying from hard drive manufacturers who wanted to overstate the disk capacity of their drives. My changes kept the IEC changes in the article, but put them into an accurate perspective. 67.183.207.37

kilo- has meant 1000 since the Greeks. It's fine to use kilo- = 1024 as an approximation or colloquialism, but it's definitely wrong where standards and precision are involved. It's expressly prohibited by the BIPM (SI), for instance. Our job is to report on things in an accurate way. The template and the article describe both the incorrect common usage and the more correct, but less common usage. Saying that the incorrect usage is the only usage is wrong and a form of advocacy, which we don't allow here.
Also, if you can provide any evidence whatsoever of an intentional "overstatement" of hard drive sizes by manufacturers, or of them "lobbying" for the IEC prefixes, I'd love to see it.
Please discuss changes to the template on the template's talk page, not here. — Omegatron 02:16, 19 June 2006 (UTC)

Origins of SI prefixes

66.32.123.29 commented on Wikipedia:Pages needing attention that the origins of the prefixes for terabyte, yottabyte and zettabyte are in some doubt. Particularly, whether 'tera' is derived from the Greek 'teras' (monster), or Greek 'tetra' (four), and similarly, I assume, whether 'zetta' is from the Latin alphabet 'zeta' or a distortion of Latin 'septem' (seven). Anyhow, dictionary.com has septem and octo roots for zetta and yotta, but another site claims that zetta and yotta were used due to a decision by the General Conference of Weights and Measures to use descending letters from the Latin alphabet, starting at the end (zeta). Anyone know the correct etymology for these? -- Wapcaplet 17:05, 3 Apr 2004 (UTC)

  • No, Latin for 4 is quadri. 66.32.113.34 17:18, 3 Apr 2004 (UTC)

Oops, I meant Greek. -- Wapcaplet 17:22, 3 Apr 2004 (UTC)

In the history, as of this point, it keeps changing. An edit on July 5 by Heron says "it's from teras not tetra", but then, on August 2, 209.6.214.139 changed it back to saying it's from tetra. 66.245.22.210 16:47, 3 Aug 2004 (UTC)

Objection to "Tradition"

"This difference arises from a conflict between the long standing tradition of using binary prefixes and base 2 in the computer world, and the more popular and intuitive decimal (SI) standard adopted widely in the industry."

This is not a matter of tradition. Base 2 is fundamental to the physical/logical construction of binary computing devices, whereas the 'intuitive' decimal system was a convenience adopted by marketeers. This has been noted in numerous reviews of computing hardware (citations are needed), usually during periods where hardware capability measurement transitioned from Kilo to Mega and Mega to Giga units (whether in terms of storage capacity or bandwidth). The reason (again noted in such articles) was that rounding Base 2 units down to decimal units invariably results in 'more bangs for the buck', which is clearly advantageous in regulatory regimes where comparative advertising is permitted, and more generally where levels of competition (and therefore advertising) are high.

Raylu's opinion

I think it would be more convenient if we deleted all the pages and put them together (in a new one), as much of the content on the pages is similar or exactly the same. --raylu 22:56, May 12, 2004 (UTC)

  • What pages; can you make a complete list?? 66.245.99.122 22:57, 12 May 2004 (UTC)
    • Sorry I wasn't more clear and for the slow response. I meant the pages like kilobyte, megabyte, etc. raylu 03:58, August 13, 2005 (UTC)
We talked about it on Talk:Binary prefix#Vote_vote_vote.21, but decided to make a navigation template instead, since articles like megabyte are large compared to exabyte. - Omegatron 04:27, August 13, 2005 (UTC)

"A typical video store contains about 8 terabytes of video. The books in the largest library in the world, the U.S. Library of Congress, contain about 20 terabytes of text."

Does this take into account compression? Both video and text can be compressed, the latter especially. Text compresses extremely well. 68.203.195.204 01:12, 26 Aug 2004 (UTC)

It shouldn't. The point is to illustrate the magnitude of the amount of data. While compression can reduce the amount of storage space it takes up, the actual amount of data remains the same. --Alexwcovington 08:35, 26 Aug 2004 (UTC)

You are right that the "actual amount of data" remains the same, but the problem is that this number, "the actual amount of data", is unknown and cannot possibly be known in practice. Knowing it would require finding the smallest possible program that can generate the data and proving that it's the smallest. (See algorithmic information theory.) Therefore the anonymous user's objection is quite valid: where do these "8 terabytes" and "20 terabytes" numbers come from? They must be either a measurement of uncompressed data or of some compression (e.g. DEFLATE) of it. --Shibboleth 21:59, 27 Aug 2004 (UTC)
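
To make the distinction above concrete, here is a small Python sketch using the standard `zlib` module, which implements DEFLATE. The sample sentence is an arbitrary, deliberately repetitive choice, so the ratio it shows is far better than ordinary prose would give. The compressed size is only an upper bound on the "actual amount of data", not a measurement of it.

```python
import zlib

# Arbitrary, highly repetitive sample text (an assumption for illustration).
text = ("The quick brown fox jumps over the lazy dog. " * 200).encode("utf-8")

# Level 9 asks zlib for its strongest (slowest) DEFLATE setting.
compressed = zlib.compress(text, 9)

print(len(text), "bytes uncompressed")
print(len(compressed), "bytes after DEFLATE")

# Compression is lossless: the original bytes come back unchanged.
assert zlib.decompress(compressed) == text
```

Any figure such as "20 terabytes of text" therefore needs to say which of these two sizes it refers to.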

I removed the claim about the "largest library in the world" completely, having changed it already. According to the respective websites, the British Library has more items, but LoC has more shelf space. Either way, I think the "largest" claim needs qualifying if it is to be used and personally I don't think it is very helpful anyway (especially to non-Americans). It would be better to say "more text than the xxx million books in the LoC" or something similar. Bobbis 21:02, 25 Apr 2005 (UTC)

American trillion, Canadian...

Someone edited this article to say "million million, American trillion". Well, what is it in Canada?? In other words, American trillion, Canadian... 66.245.115.34 21:44, 27 Aug 2004 (UTC)

See trillion. --Shibboleth 21:49, 27 Aug 2004 (UTC)
So, shouldn't it then be "million million (English trillion)" or "million million (short scale trillion)" ? Ian Cairns 23:57, 27 Aug 2004 (UTC)
My last edit tried to address these points. Ian Cairns 12:30, 28 Aug 2004 (UTC)

European (please write the answer here)

Well, the UK is part of Europe - so this should be 'answers' rather than 'answer'. Alternatively: Continental European (please write the answer here). Ian Cairns 01:46, 28 Aug 2004 (UTC)

My last edit tried to address these points. Ian Cairns 12:30, 28 Aug 2004 (UTC)

Size of Wikipedia

How large is the entire database of Wikipedia in terabytes as of this moment?? 66.32.244.146 02:34, 1 Nov 2004 (UTC)

According to Wikipedia Statistics the grand total for all Wikipedias is just 2.3 Gigabytes. --Alexwcovington 09:16, 1 Nov 2004 (UTC)

The stats now say that as of 13 July 2005, Wikipedia is 4.1 Gb. 159753 20:27, 8 November 2005 (UTC)

The latest numbers on the above link are for Sep 2006, when Wikipedia was at 15 Gb. No new statistics have been posted since Oct 2006. Maetrix 19:17, 10 September 2007 (UTC)

You mean GB in the above, right, guys? 71.83.183.109 (talk) 08:26, 8 March 2008 (UTC)

Are they even related?

How are file sizes and 0's and 1's related? I do not agree with the pages being merged.

They are related in that a Terabyte is a measure of the amount of 0's and 1's. That said, I do agree they should not be merged. A terabyte is certainly a topic that can stand on its own as an article. A centimetre, metre, kilometre, etc... all have their own pages so why shouldn't a terabyte? — oo64eva (AJ) (U | T | C) @ 05:41, Apr 16, 2005 (UTC)
Hence the question mark. All the smaller articles like pebibit should definitely be merged. See Talk:Binary prefix#Consolidate all the little articles and Talk:Binary prefix#We need a template for the little articles. - Omegatron 12:29, Apr 16, 2005 (UTC)
Oh. Someone changed the merge template. It used to have a question mark after it for the bigger articles like megabyte. *sigh*... - Omegatron 12:31, Apr 16, 2005 (UTC)

Terabyte or Terrabyte

Both versions are used on the same page! Which is correct?

Fixed Ian Cairns 10:06, 11 Jun 2005 (UTC)

"Terabyte" is correct. 67.183.207.37 05:55, 12 June 2006 (UTC)

Gigabytes first

It's easier for the layman to understand TB if we compare it to GB first. --Capsela 14:30, 12 January 2006 (UTC)

Cite for the UPS thing?

"The shipping company UPS has approximately 474 terabytes of information in its databases."

There's no way that could possibly be true, is there? PSXer 03:56, 18 May 2006 (UTC)

Ah, I've been googling around, and it looks like that figure could refer to the total hard drive space on all their computers (of which they apparently have over a quarter million), not one gigantic database. Of course, quite a bit of that would probably be empty space, as well as a lot of redundant data on each computer such as OS files. PSXer 04:09, 18 May 2006 (UTC)

Or backups. — SheeEttin {T/C} 18:25, 18 June 2006 (UTC)

Advertising

Is the definition that's just a link to a web design company by that name at all allowable? I shouldn't think so. I'm going to remove it shortly if nobody objects. Maybe a mod could tell that user to stop the ads? GrubLord 11:33, 27 September 2006 (UTC)

Removed Maxtor 1TB CompUSA reference

I removed the reference to the Maxtor 1TB external disk since it's actually two 500GB disks in a RAID 0 or RAID 1 configuration. This is a very common type of product.


Hard Drive Growth

4 terabytes, Nov 4th, Hitachi

Terminology

It would help computer users like me if we handled the difference between GB as reported by system software like Windows (i.e., 2^30 = 1,073,741,824) and GB as reported by many device manufacturers (i.e., 10^9 = 1,000,000,000).

Unfamiliar terms like GiB and TiB do not help much. Being too precise loses much of our audience. --Uncle Ed (talk) 18:20, 20 December 2007 (UTC)
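
The size of the gap between the two definitions mentioned above can be sketched in Python as follows. The helper name `decimal_to_binary_units` and its `power` argument are assumptions made for this illustration. The calculation only models software that divides by powers of 1,024, and does not describe how any particular operating system rounds or labels its output.

```python
def decimal_to_binary_units(value, power):
    """Express `value` decimal units (10**(3*power) bytes each) in the
    binary unit of the same rank (2**(10*power) bytes each)."""
    return value * 10 ** (3 * power) / 2 ** (10 * power)


# power=3 is the giga/gibi rank, power=4 is the tera/tebi rank.
for label, power in [("GB", 3), ("TB", 4)]:
    shown = decimal_to_binary_units(1, power)
    print(f"1 {label} (decimal) is about {shown:.3f} in binary units")
```

The shortfall grows with each prefix step, which is why the difference is more noticeable at the terabyte level than it was for kilobytes.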

The point is discussed in Gigabyte#Consumer_confusion. Is that what you mean? Thunderbird2 (talk) 18:52, 20 December 2007 (UTC)

Yes, exactly! In fact, I'd also like to have a table which compares Giga BITS vs. Giga BYTES. When uploading and downloading large files, people want to estimate how long it will take.

For example, if my AVI video file is 12 GB (gigabytes) in size, how long will it take to transmit on a 100 Mb/s (megabits per second) computer network? --Uncle Ed (talk) 19:14, 20 December 2007 (UTC)
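
A rough way to work out the estimate asked about above, sketched in Python. It assumes decimal prefixes throughout (1 GB = 10^9 bytes, 1 megabit = 10^6 bits), a link that sustains its full nominal rate, and no protocol overhead, so a real transfer would take longer. The function name `transfer_seconds` is made up for the example.

```python
BITS_PER_BYTE = 8


def transfer_seconds(size_bytes, rate_bits_per_second):
    """Idealized transfer time: total bits divided by the link rate."""
    return size_bytes * BITS_PER_BYTE / rate_bits_per_second


# 12 GB file over a 100 megabit-per-second link (decimal prefixes assumed).
seconds = transfer_seconds(12 * 10**9, 100 * 10**6)
print(f"{seconds:.0f} seconds, or {seconds / 60:.0f} minutes")
```

Under these assumptions the answer is 960 seconds, or 16 minutes. If the file size were instead 12 × 2^30 bytes, the same calculation would give a slightly longer time.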

It's fairly safe to assume that a megabit is one million bits, but a gigabyte is anyone's guess. I suppose that's your point, right? How do you think the article can be improved though? Thunderbird2 (talk) 22:30, 20 December 2007 (UTC)

For one thing, remove claims that tebibyte et al. are in wide use, such as: "This amount is now known instead as a tebibyte, to avoid confusion."

Secondly, let's collect the various charts such as the one at Binary prefix which compare byte prefixes based on 2^10 (1,024) with those based on 10^3 (1,000). I'm getting tired of re-inventing the wheel here. --Uncle Ed (talk) 02:38, 21 December 2007 (UTC)

Where do you see the words "This amount is now known instead as a tebibyte, to avoid confusion"? -- Preceding unsigned comment added by Thunderbird2 (talk • contribs) 08:06, 21 December 2007 (UTC)
Oh, I see now - you had already removed them. I have reinstated a weaker version of the claim. Thunderbird2 (talk) 08:16, 21 December 2007 (UTC)

Thanks, T-bird. Your wording is much better than my deletion. Merry Xmas! --Uncle Ed (talk) 01:32, 22 December 2007 (UTC)

Merry Christmas to you too :-) By the way, have you seen this? Thunderbird2 (talk) 18:52, 25 December 2007 (UTC)

I made some further edits to indicate that both usages (SI and binary) are still in use; someone had edited it as if SI was the last word and everyone had ceased using the binary definition. (Yes, that would be nice, and it would be nice if the "tebi" style prefixes would come into use, but it's not the reality.) I also added that operating systems typically report the binary version. 58.108.76.51 (talk) 01:08, 23 April 2008 (UTC)

the terabyte is not a unit of bandwidth

Blahman2 has introduced the text:

  • dreamhost offers its hosting customers 5 terabytes of bandwidth and they increase it per week for each individual.

I removed the text once because it doesn't make sense (the terabyte is not a unit of bandwidth) and was reverted by Blahman2. I then tried to improve it (by replacing TB with TB/s in the hope that this was the intended meaning) and was reverted again. Do others have suggestions for improving the text? I still think it should be deleted. Thunderbird2 (talk) 23:16, 3 March 2008 (UTC)

Try per month. It doesn't represent a top instantaneous speed, but a "if you're running at full speed all the time, we're gonna charge you more" limit. Is that a bandwidth measurement? Well, it has the right dimensions anyway. 5T/s would be really fast! --tcsetattr (talk / contribs) 09:52, 7 March 2008 (UTC)
Bandwidth in this context is a rate or speed, meaning it is "something" over "a period of time". In this context the term "terabyte" on its own is therefore not bandwidth. The ISP may use the shorthand of a size without specifying the period of time explicitly, but the period of time is implied; in this case it is most likely a month. So the term terabyte/month would be more accurate, but it would also look messy :) . Fnagaton 12:57, 7 March 2008 (UTC)
Five terabytes per month is only 2 megabytes per second. I don't think that a bandwidth of 2 MB/s merits a mention on the terabyte page. Therefore it should be deleted. Thunderbird2 (talk) 16:11, 7 March 2008 (UTC)
Why is "per second" the most important measurement? That little factoid is about a policy which doesn't apply per second; it applies per month. Converting it to any other time frame makes it meaningless. --tcsetattr (talk / contribs) 21:10, 7 March 2008 (UTC)
The bandwidth article quotes all speeds in multiples of bits per second (kbps, Mbps). You say it (the 5 TB/month) is not a "top instantaneous speed". That seems to me to be tantamount to stating that it is not a bandwidth. Just another reason for deleting the dreamhost item. So what quantity is it that equals 5 terabytes per month? Thunderbird2 (talk) 22:17, 7 March 2008 (UTC)
I've not done business with dreamhost (or even bothered to research the statement - I'm not the one who put it in the article in the first place, and in fact I didn't even look at the article until right now, I was just answering Thunderbird2's question). But you seem to be disputing the word "bandwidth" as it's used by this particular industry. Perhaps you should take a look at Bandwidth (computers)#Bandwidth in web hosting where this newer usage of the word is described. --tcsetattr (talk / contribs) 23:11, 7 March 2008 (UTC)
Ah, now I understand. The link says it should be referred to as "monthly data transfer", so I will change it to that. Thanks Thunderbird2 (talk) 23:23, 7 March 2008 (UTC)
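
For reference, the conversion of the 5 terabytes per month figure into an average rate can be sketched in Python. This assumes a decimal terabyte (10^12 bytes) and a 30-day month. Both are assumptions, and the names `SECONDS_PER_MONTH` and `average_rate` are invented for the example.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumption: a 30-day month


def average_rate(bytes_per_month):
    """Average bytes per second if the allowance is used evenly all month."""
    return bytes_per_month / SECONDS_PER_MONTH


rate = average_rate(5 * 10**12)  # 5 TB, decimal prefix assumed
print(f"{rate / 10**6:.2f} MB/s on average")
```

This comes to roughly 1.93 MB/s, in line with the "only 2 megabytes per second" estimate above. It is an average over the month and says nothing about peak speed, which is the distinction the thread ends up drawing with "monthly data transfer".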

Promotion issue?

It strikes me that the inclusion of this point in the article comes across as promotional (less so now than the original, but still). Do others not think so? Other points in the section seem to either involve non-commercial entities, or make claims that wouldn't really be used in marketing—no-one is going to say "wow, Walmart have a huge data warehouse, must use them!" SamBC(talk) 18:55, 10 March 2008 (UTC)

Data transfer caps

The real point is that the term "bandwidth" has fallen into widespread misuse in hosting providers' promotional materials. It is very common to see such claims as "5 TB bandwidth included" to mean "extra charges commence after the first five terabytes transferred each month" or "your service may be suspended after the first five terabytes transferred each month". This says almost nothing about the actual data rates during transfer (peak or sustained). It could imply 400 seconds of service at 10 gigabits/second over each of ten simultaneous connections, or much longer service over less hungry connections.

Any datacomm engineer would cringe even at conflating data rate with bandwidth (compare baud). In North America, the old 300 baud modems were often configured to use 8 data bits, a start bit, a stop bit, and a parity bit, so they had an effective peak data rate of 300 * 8 / (8+3) = 218.18 bits per second. Yet they occupied most of a 3 kHz telephony channel's bandwidth in the band between 300 and 3300 Hz. When 1200 baud modems were introduced they changed symbol encoding but still used the same bandwidth. At the time it was widely believed even amongst fairly sophisticated users that Shannon's law and the Nyquist criterion meant there would never be more than 6 kilobits per second possible over a telephone line (total up and downstream). Ultimately designers were able to approach 56 kilobits per second over the same 3 kHz bandwidth by using more and more sophisticated symbol encoding and error correction schemes to trade off increased raw bit error rates for reduced signal to noise margin in each separately corrected sub-channel. LeadSongDog (talk) 15:53, 14 March 2008 (UTC)
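
The framing arithmetic in the comment above can be written out as a short Python sketch. The function name `effective_bit_rate` is hypothetical. It models only the fixed per-character overhead described (start, stop and parity bits) and ignores everything else about a real modem link.

```python
def effective_bit_rate(line_rate, data_bits, overhead_bits):
    """Payload bits per second when every frame carries `data_bits` of
    payload plus `overhead_bits` of framing at the given line rate."""
    return line_rate * data_bits / (data_bits + overhead_bits)


# 300 bits/s line, 8 data bits, 3 overhead bits (start + stop + parity).
print(f"{effective_bit_rate(300, 8, 3):.2f} bits per second")
```

This reproduces the 218.18 bits per second figure. The channel bandwidth in hertz does not appear anywhere in the calculation, which is the point being made about keeping data rate and bandwidth separate.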

Mac Pro comment

I'm removing it, as it doesn't need to be in the article. Any computer can have 4 TB across four hard drives. 68.154.166.169 (talk) 14:34, 14 April 2008 (UTC)

I agree. 200.192.77.252 (talk) 20:56, 16 April 2008 (UTC)