Wikipedia talk:Manual of Style (dates and numbers)/Archive 22

This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Non-base ten numbers

We currently do not have a guideline on how to write numbers that are not decimal (base 10), but hexadecimal (16), octal (8) or binary (2), to name the most frequent different bases. Do we need one? If so, which style?

I am using 0x00 for hexadecimal code points in general, but have also seen most of x00, h00 or 00h, $00, #00, HEX00 or hex00 and 00<sub>16</sub> or <sub>16</sub>00 in Wikipedia. (Luckily not in articles I edited so far.) Subscript indices are certainly the most flexible solution, but also the most cumbersome to write; second is HEX, DEC, OCT, BIN (upper, lower or caps case) and h, d, o, b, which do not pose a problem when copied to plain text.

I usually use uppercase letters A–F (10–15) for the additional hexadecimal digits and would also use these with other bases larger than ten, if I needed to, although duodecimal (base 12) also commonly uses X (10) and # (11). Christoph Päper 8 July 2005 03:27 (UTC)

When I expanded the base (mathematics) article a while back, I used the 123<sub>8</sub> style because I needed that flexibility, and because it was the only non-computer way I knew. A style for that wouldn't be a bad idea - what do you think it should be? (Maybe different contexts call for different notations?) Neonumbers 8 July 2005 10:29 (UTC)
0x prefixing is "C notation". It is fairly widely used in programming languages that are (or were originally) built on C, like Perl, Python, and Java, and is often adopted in more general texts. I've used it myself in prose, but also made sure to add a note explaining the convention. Other times I just put something like "2C (hex)". Subscripts seem to be the preferred format in mathematical formulas, from what I remember reading, and I've seen them in prose, but when they appear in prose or in tables (like character code charts), it's terribly distracting, especially if it is used more than once. Someone recently tried to use a subscripted "HEX" on a bunch of code value ranges in one article I watch, and it looked horrible. Even if the subscripts were made extra small, I don't think they should be recommended in prose, ever. — mjb 8 July 2005 19:14 (UTC)
What I had in mind was, in articles about programming etc., use C notation, and in all other articles, use that subscript notation that you (mjb) advise against; but in articles that use lots of only one type of number, just say so at the beginning of the article and let that be the only notice for that article. Same goes for sections and tables. Neonumbers 9 July 2005 12:17 (UTC)
I agree. In articles in which the numbers are binary-related, use the C prefixes (0x, 0b, uhh... what is the prefix for octal?) For general descriptions of numbers, in which the base could just as easily be 12 or 37, use subscript notation of some type. - Omegatron 18:55, July 11, 2005 (UTC)
For octal it's just "0", which might be confusing. – Smyth\talk 10:58, 12 July 2005 (UTC)
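A minimal Python sketch of how the C-style prefixes discussed in this thread could be read. The function name `parse_c_style` is hypothetical and is not taken from any article or library. The `0b` prefix is handled as the thread describes it; as far as I know it was a compiler extension rather than part of standard C at the time, so treat that branch as an assumption.

```python
def parse_c_style(text: str) -> int:
    """Parse a non-negative integer written with C-style base prefixes."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)   # hexadecimal, e.g. 0x2C
    if lowered.startswith("0b"):
        return int(lowered[2:], 2)    # binary, e.g. 0b101
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered[1:], 8)    # octal: only a leading zero marks it
    return int(lowered, 10)           # plain decimal


for literal in ("0x2C", "0b101", "017", "17"):
    print(literal, "=", parse_c_style(literal))
```

Here "017" and "17" differ only by the leading zero yet denote fifteen and seventeen, which is the ambiguity raised about the octal prefix.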

Possible addition to MoS on this topic:

For numbers not in base ten,
  • in computer-related articles, use the C programming language prefixes, that is, 0x for hexadecimal, 0 for octal and 0b for binary
  • in all other articles, use subscript notation, for example 137<sub>9</sub>, 241<sub>6</sub>, 2X9#<sub>12</sub>

Please check that the C prefixes I wrote are correct. Any changes? Neonumbers 01:46, 14 July 2005 (UTC)

I like it. 0 for octal is pretty confusing/ambiguous, though. It's only used in a few articles, I'm sure, so maybe a note should be made on each page where it's used? - Omegatron 04:00, July 14, 2005 (UTC)
After thinking about it, I'm in favor of prefixed, lowercase ‘b’, ‘o’, ‘d’, ‘h’, where that suffices. They are short, easy to type and their meaning can be guessed easily. Otherwise, which should be rare, use decimal subscripts. Use uppercase letters A–Z for d10–35, if necessary, i.e. A–F in hexadecimal. Christoph Päper 04:26, 17 July 2005 (UTC)
Well, whatever's convention... Neonumbers 01:20, 18 July 2005 (UTC)
For numbers in bases other than base ten,
  • in computer-related articles, use the C programming language prefixes, that is, 0x for hexadecimal, 0 for octal and 0b for binary. It may be a good idea to include a note at the top of the page about these prefixes.
  • in all other articles, use subscript notation, for example 137<sub>9</sub>, 241<sub>6</sub>, 2X9#<sub>12</sub>, A87D<sub>16</sub> (use <sub> and </sub>)
  • For bases eleven and higher, use whatever symbols are convention for that base. Where applicable, use uppercase rather than lowercase letters, thus 0x5AB3 not 0x5ab3.
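The subscript style in the proposal can be sketched in Python as follows. The helper names `to_base` and `with_subscript` are hypothetical. The sketch assumes uppercase A to Z for digits above nine, so it does not reproduce the duodecimal X and # symbols mentioned earlier in the thread.

```python
import string

DIGITS = string.digits + string.ascii_uppercase  # 0-9 then A-Z, up to base 36


def to_base(value: int, base: int) -> str:
    """Return a non-negative integer as a digit string in the given base."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    digits = []
    while value:
        value, remainder = divmod(value, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def with_subscript(value: int, base: int) -> str:
    """Write the number followed by its base in <sub> markup."""
    return f"{to_base(value, base)}<sub>{base}</sub>"


print(with_subscript(115, 9))      # 115 decimal written in base nine
print(with_subscript(0xA87D, 16))  # uppercase hexadecimal digits
```

Keeping the digit conversion separate from the markup means the same digits could be reused with a C-style prefix in computer-related articles.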
I haven't changed the part about computer-related prefixes, could someone in that field please advise of convention.
If there are no objections on or before 25 July 2005 UTC, I will add the above text to the Manual of Style. Neonumbers 11:17, 19 July 2005 (UTC)

Unit Disagreement, MiB vs. MB

Discussion moved from the Village Pump - Omegatron 23:05, July 9, 2005 (UTC)

What unit types should be used when describing storage capacity in articles?

Quantities of bytes

SI prefixes                                          Binary prefixes
Name (Symbol)     Standard SI        Alternate use   Name (Symbol)     Value
kilobyte (kB)     10^3  = 1000^1     2^10            kibibyte (KiB)    2^10
megabyte (MB)     10^6  = 1000^2     2^20            mebibyte (MiB)    2^20
gigabyte (GB)     10^9  = 1000^3     2^30            gibibyte (GiB)    2^30
terabyte (TB)     10^12 = 1000^4     2^40            tebibyte (TiB)    2^40
petabyte (PB)     10^15 = 1000^5     2^50            pebibyte (PiB)    2^50
exabyte (EB)      10^18 = 1000^6     2^60            exbibyte (EiB)    2^60
zettabyte (ZB)    10^21 = 1000^7     2^70            zebibyte (ZiB)    2^70
yottabyte (YB)    10^24 = 1000^8     2^80            yobibyte (YiB)    2^80

A problem has arisen in different related articles on whether to use the MB or MiB. Some articles have decided to stick with using MB, some have chosen to use MiB.

Talk:PlayStation_3#Memory_prefixes
Talk:Xbox_360#Mib_v._MB

What is the difference?

MB uses the SI decimal (base ten) system, but computers use 1 or 0, a binary (base two) system. Binary 2^10 (1,024) is almost equal to the decimal 10^3 (1,000), so early on 1,024 bytes was referred to as a kilobyte. This is only a 2.4% difference; however, at larger scales, such as exabytes, the difference is about 15%. Depending on which computer component is being discussed, MB may mean 1,000,000 bytes or 1,048,576 bytes.
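How the gap between the binary and decimal readings grows with each prefix can be checked with a short Python sketch. The list `PREFIXES` and the function `percent_gap` are hypothetical names introduced only for this illustration.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def percent_gap(power: int) -> float:
    """Percentage by which 1024**power exceeds 1000**power."""
    return (1024 ** power / 1000 ** power - 1) * 100


# One line per prefix, from kilo (power 1) up to yotta (power 8).
for power, name in enumerate(PREFIXES, start=1):
    print(f"{name:>5}: {percent_gap(power):5.1f}%")
```

The loop prints one percentage per prefix, so the figure for any scale can be read off directly instead of being estimated.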

http://physics.nist.gov/cuu/Units/binary.html
http://www.iec.ch/zone/si/si_bytes.htm
http://en.wikipedia.org/wiki/Mebibyte

Argument in favor of KB, MB, GB.

  • Manufacturers usually post system specifications using these terms.
  • Computer professionals generally understand from context which meaning of MB is intended.
  • Consumers are more familiar with MB and would be confused by other terms.

Argument in favor of KiB, MiB, GiB.

  • MiB is a recognized standard and technically the correct term to use.
  • MiB can reduce confusion as it explicitly states whether binary or decimal capacities are being discussed.
  • MiB is gaining more acceptance and over time will be a more familiar term.

The above is brought to us by User:Thax, who forgot to sign.

Personally, I prefer to use the more familiar MB (NIST be damned :-). That said, you might consider using the approach often used with the also ambiguous billion, which would be to add (10^6 bytes) or (2^20 bytes) following the first usage depending on which is intended. Dragons flight July 7, 2005 22:02 (UTC)
Thank you for your speedy response. Do you think that it would be something worth putting to a vote? Do you think enough people even care about this issue? --Thax 8 July 2005 03:05 (UTC)
No, nobody cares, and anyway the result would be that we should go with MB but change things when (and IF) MiB becomes more common. Until I saw this writeup I didn't even KNOW this MiB existed. That is a significant thing, considering I've been downloading from the internet since 1996. But let's look at the figures shall we? Googles I mean, of course... :)
  • MB: 86,300,000 >> MiB: 2,580,000
  • KB: 60,700,000 >>> KiB: 1,070,000
  • GB: 37,900,000 >>> GiB: 1,140,000
Looks like the i loses. By at least a whole digit or more. Master Thief GarrettTalk 8 July 2005 08:15 (UTC)

The only reason anyone might care is if they already know the difference. What is important is reporting the actual capacity accurately, and where necessary pointing out whether the manufacturer's labelling is inaccurate and/or misleading. The best idea would likely be to quote the manufacturer's specifications exactly and annotate this with a more accurate figure if you have it to hand.

A big reason for confusion is that sometimes the decimal and binary multipliers are mixed up: an example would be mis-stating a megabyte as 1,000 kilobytes: if you then move onward towards gigabytes the error can be compounded. Trying to determine how many bytes a given storage device might be capable of storing can be an exercise in frustration.

HTH HAND —Phil | Talk July 8, 2005 09:27 (UTC)

Note that SI symbols are case sensitive. Prefixes up to 'k' are lower case. Prefixes for 'M' and beyond are UPPER CASE. Thus 'kB' not 'KB'. Similarly, 'km' and 'kg', not 'Km' and 'Kg'. Bobblewik  (talk) 8 July 2005 12:45 (UTC)
"No, nobody cares"
Oh yes we do. ;-)
Standards and accuracy are more important than tradition. We aren't going to start using feet instead of meters just because it gets more Google hits.
I say we use the IEC prefixes, and when using the SI prefixes, it should be mentioned which way they are being used, since they are ambiguous.
Related policies: Wikipedia:Manual of Style (dates and numbers)#Style for numbers.2C weights.2C and measures - Omegatron July 8, 2005 13:05 (UTC)
Good point about caring. I care too. Note that SI prefixes are not ambiguous. SI is just about the only thing in the world of units that we all agree has just one meaning. Our uncertainty when reading memory size specifications is not due to ambiguity in SI. It is due to some people using prefixes incorrectly. Bobblewik  (talk) 8 July 2005 13:34 (UTC)
Correction, accuracy is more important than both tradition and standards. However, I don't think this is something that needs to be decided as a policy one way or the other. Nine point nine times out of ten, "kilobyte" will be used with the binary meaning, and this is well enough established that confusion is unlikely. On the other hand, there's no point in going around changing articles that use "kibibyte", since those who have never heard of it can simply look it up in the nearest available encyclopedia. :)
If there is to be a policy, I suggest that it's the same one as with US/UK spelling: be consistent within an article, but respect the original author's choice, and don't change an entire article just for the sake of changing. Only if there is a real risk of confusion (as with words meaning different things in the US/UK), is an explicit definition needed. – Smyth\talk 8 July 2005 13:52 (UTC)

Ehhhhhhh. It's not the same as spelling. Everyone knows that center = centre. Not everyone knows that a CD MB (1,048,576) and a DVD MB (1,000,000) are different. I agree that accuracy > standards > tradition. Standards and accuracy go hand-in-hand, though. I would much prefer "unless otherwise noted, kB in the Wikipedia means 1000 bytes", but if you want to go through and add "(decimal meaning)"/"(binary meaning)" after every instance of the word "KB" to maintain accuracy, go for it. - Omegatron July 8, 2005 14:21 (UTC)

Omegatron, that's not very realistic. - Omegatron
A CD MB and a DVD MB are the same. If you're comparing the capacities of the discs directly then you have to use one or the other. It's just that the discs differ in which one is traditionally used to give capacities. As for "unless otherwise noted, kB in the Wikipedia means 1000 bytes", well that is surely a false statement right now, and would require a vast amount of work to make it true, and a continuous patrol to correct those who weren't aware of it. – Smyth\talk 8 July 2005 14:30 (UTC)
The main problem that I noticed isn't due to the number of articles that use one term or the other, but rather the unit disagreement on related articles managed by different people. For example, on the PS3 page the talk decision was made to use MiB, but on the Xbox 360 page the decision was made to use MB. For every new page that someone may want to convert to the technically correct term there needs to be a large discussion started on the merits and pitfalls of using one unit or the other. Making the decision in one location would help speed up this process and bring consistency to related articles. --Thax 8 July 2005 16:05 (UTC)
Well I've already put my two cents in on the PlayStation 3 talk page. I think it's a fallacy to say "stick with tradition until more people start using correct terms" because by that line of reasoning, we'll be incorrectly applying SI prefixes for decades to come. Why not start now? Most readers will simply ignore the 'i' in 'KiB', 'MiB', etc. and read 'KB' and 'MB', respectively. Those that do notice the difference enough to wonder what it means can click the wikilink and, *gasp* learn something new (whenever I use IEC binary prefixes, I link their first instance to the article explaining their usage). I've obviously made a fuss about keeping IEC binary prefixes on some pages I watch, but only because I really believe we should all be moving towards the technically correct prefixes now that they are standardized, and what better place to start than an encyclopedia? -- uberpenguin July 8, 2005 15:13 (UTC)
It seems logical to me that if MiB is here to stay in some articles, since it is the technically correct term to use, it would not be possible to make a policy decision to choose MB or MiB in all cases. For example, there may be articles where the discussion of capacities is very important and needs to use both MiB and MB to be specific.
Therefore it seems to me that there are the following choices:
1. The use of MiB is required in all articles.
2. The use of MiB is recommended in all articles.
3. The use of MiB or MB should be decided on a page by page basis. (No policy)--Thax 8 July 2005 16:13 (UTC)

It seems that what got this whole discussion started in the first place is that many people object to changing MB to MiB in existing articles, on the grounds that it's too obscure. They have a point. However, if, as you say, the exact capacity is important and there's any chance of confusion, then MiB is probably preferable. This does not mean that MB would then be declared to always mean 10^6. The decimal meaning is so rare that its use should always be explicitly declared. – Smyth\talk 8 July 2005 16:19 (UTC)

Agreed, I don't think that the MB should be declared to always mean 10^6, that would be wrong. The main point of the policy decision would allow people to fix related articles to use the same units without needing to duke it out in the discussion page. My guess would be that the decimal meaning happens about 50% of the time, for example Hard Drives use the decimal meaning, while memory uses the binary meaning.--Thax 8 July 2005 16:47 (UTC)
That's all well and good but it still doesn't address what should happen when some editor changes correct MiB references to MB due to personal preference, and other authors (such as myself) wish to leave the references with the binary prefixes for their own valid reasons. Should this just go on being resolved on a case-by-case basis? If so, this will surely continue to come up until eventually a heated argument will cause some case to go to arbitration when two parties can't reach an agreement. I thought it would be nice to try and at least set some loose guidelines on the usage of the SI vs IEC binary prefixes for data capacities... -- uberpenguin July 8, 2005 17:15 (UTC)
I agree with this as well. Personally I think that MiB should be a recommended option, this approach seems to work best for all parties involved.
The use of the binary prefixes, such as MiB, shall be preferred over ambiguous SI decimal references. The use of the new binary prefix standards is not required but is recommended for all articles where binary capacities are used. If a contributor changes an article with a binary capacity reference to use the more accurate binary system, that change should be accepted over an ambiguous application of the SI decimal system.
Does this sit well with everyone, or do we need to put this to a vote?--Thax 8 July 2005 18:27 (UTC)
That's a good idea. Instead of voting, someone start a proposed policy page, stick a {{proposed}} tag on it, start linking to it every time you change a unit, and it will evolve until we get a consensus and it becomes a guideline.
I don't see what's wrong with linking every instance of MiB. It's not terribly distracting. - Omegatron July 8, 2005 19:23 (UTC)
Agreed on both counts. – Smyth\talk 8 July 2005 19:39 (UTC)

I may just be out in left field on this, but I would rather not be in the position of saying that MiB is "preferred". I would rather distinguish between cases where the technical distinction is important and cases where the usage is incidental. For example, an article on CD-ROM format specifications, were such to exist, clearly cares about MiB vs. MB. Whereas an article on computer simulated cosmology doesn't really care whether the simulation occupied 1 TB or 1 TiB. The latter, even if technically correct, distracts from the flow by presenting a term unfamiliar to most English readers in a context where it is basically irrelevant. Also, this doesn't address what to do with storage capacities that really are 10^6 bytes. Obviously we can't define 1 MB = 10^6 bytes since the real world doesn't consistently use it that way. So, should we start talking about 0.95 MiB? And what about the even more awful 1,024,000 bytes? I would propose instead a guideline to read something like the following:

  • In most circumstances, the common English designations kB, MB, GB, etc. should be preferred if the precise specification is unknown or is largely irrelevant to the reader's understanding of the article.
    • Examples include:
      1. The capacity of a particular computer model when used in articles only incidentally mentioning that model's result.
      2. Estimates of the amount of information collected by the spy satellites each day.
  • In cases where the precise specifications are known, but are likely to be of interest to only a few readers, rather than most, editors are encouraged to parenthetically write out the intended meaning the first time it is used: e.g. "4 MB (4*10^6 bytes)" or "4 MB (4*2^20 bytes)" or "4 MB (4,096,000 bytes)". In this case, it may also be appropriate to write "4 MiB (4*2^20 bytes)", if the device's storage capacity is routinely expressed as a multiple of a binary power.
    • Examples include:
      1. The storage capacity of most consumer electronic devices, unless data storage is a major part of the discussion.
      2. The size of most software packages.
  • Lastly, technical articles, where the precise number of bytes is likely to be of interest to most readers, are encouraged to use KiB, MiB, GiB, etc. throughout.
    • Examples include:
      1. Detailed discussions of storage formats or compression algorithms.
      2. Discussions of devices focusing on storage capacity or comparing storage capacity between many similar devices.

I don't expect that everyone will agree with this, but this summarizes how I would want to approach the problem. Dragons flight July 8, 2005 21:27 (UTC)

I like it. – Smyth\talk 8 July 2005 21:52 (UTC)
What's wrong with listing something as MB or MiB? If you want to know what the value is, follow the link. The link should make things clear, right? Vegaswikian 8 July 2005 21:56 (UTC)
MiB is always clear, but MB never is since industry groups use it interchangeably to mean either 1,000,000 bytes, 2^20 = 1,048,576 bytes, or 1,024,000 bytes. Dragons flight July 8, 2005 22:11 (UTC)
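The three readings of "MB" mentioned in this thread can be laid side by side with a small Python sketch. The function name `megabyte_readings` and the labels are hypothetical, chosen only for illustration.

```python
def megabyte_readings(count: int) -> dict[str, int]:
    """Byte counts for `count` MB under three conventions seen in practice."""
    return {
        "decimal (10^6)": count * 1000 * 1000,
        "binary (2^20)": count * 1024 * 1024,
        "mixed (1000 * 1024)": count * 1000 * 1024,
    }


# The same label "4 MB" can stand for three different byte counts.
for label, size in megabyte_readings(4).items():
    print(f"4 MB, {label}: {size:,} bytes")
```

Writing the byte count out in parentheses on first use, as suggested above, removes the need for the reader to guess which of the three rows applies.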
Okay, I tried to summarize everyone's ideas and viewpoints the best I could at Wikipedia:Manual of Style (dates and numbers)#Binary unit prefixes. Please tweak it as required. If it is in the wrong place please move it; I am a newb and just trying my best at doing the right thing. What should we do with this discussion? Is it bad etiquette to copy and paste everyone's comments to a different location? --Thax 20:02, 9 July 2005 (UTC)
It is ok to move discussions; especially from the Village Pump, where they will be archived (effectively deleted) after a short time period.
I moved it here. - Omegatron 23:05, July 9, 2005 (UTC)

Such changes are meant to get consensus before being published on a project page, that means voting unless clearly everyone in the discussion is thinking exactly the same thing. I didn't really see much of a consensus there; correct me if I'm wrong. So if there isn't one clear consensus (or it is evident that consensus will never be reached), that section shouldn't be there at all - yet. Neonumbers 10:56, 10 July 2005 (UTC)

We must not imply that SI prefixes are ambiguous. They are not ambiguous. SI was specifically created to eliminate ambiguity in unit terminology. It is about the most unambiguous thing that anyone in the world can use with respect to units. People may use SI prefixes incorrectly, but that does not mean that SI prefixes have more than one definition.
For example, the phrase the SI prefix was used in a binary sense implies that SI prefixes have more than one interpretation. Similarly, the MiB reference is less ambiguous implies that SI prefixes are more ambiguous.
Apart from that, I welcome the attention that is being paid to this issue. Bobblewik  (talk) 11:31, 10 July 2005 (UTC)
I'm sorry, but "MB" is somewhat ambiguous in a way that "MiB" is not. Yes, from an official point of view, using M to mean 2^20 is utterly and totally incorrect, but from a neutral point of view that is exactly what people do, and calling an overwhelmingly common use "incorrect" is being pedantic. Look at Nucular – I was shocked when I saw it, but language changes, whether we want it to or not. – Smyth\talk 12:52, 10 July 2005 (UTC)
I disagree. Language changes, but measurements, units, and SI prefixes are not language. We're not going to redefine the definition of the meter so that the speed of light works out to exactly 3×10^8, just because everyone says it that way. - Omegatron 13:07, July 10, 2005 (UTC)
Your speed of light example is totally off-base. It is, in fact, correct to say, when you include the units, that the speed of light is 3×10^8 m/s. Not only that, but it is also correct to say that the speed of light is 3.0×10^8 m/s or 3.00×10^8 m/s. That number is accurate to three significant digits and then some, with a relative error of one part in 1444. In other words, it isn't a "different" number, it is simply the same number expressed to less precision. Gene Nygaard 13:39, 10 July 2005 (UTC)
If someone says the speed of light is 3×10^8 m/s, they know they are making an approximation. If they say a CD has a capacity of 702 MB, they know what they mean by "M", and it's not 1,000,000. The prefixes are part of the language of the computer world, but of course I'm not suggesting that SI prefixes in general are ambiguous. This is just a special case. – Smyth\talk 13:16, 10 July 2005 (UTC)
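The "one part in 1444" figure quoted in this exchange can be reproduced with a few lines of Python. The constant names are hypothetical. The exact value is the defined speed of light in metres per second.

```python
C_EXACT = 299_792_458      # metres per second, exact by definition of the metre
C_ROUNDED = 300_000_000    # the 3×10^8 m/s approximation

difference = C_ROUNDED - C_EXACT
parts = C_EXACT / difference   # relative error expressed as "one part in N"

print(f"difference: {difference:,} m/s")
print(f"relative error: one part in {parts:.0f}")
```

For comparison, reading "MB" as 2^20 rather than 10^6 bytes is off by roughly one part in 21, a much larger relative gap than the rounding of the speed of light.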

I agree that binary prefixes are not misused as much as SI prefixes. So if I see '2 MiB', I may be more certain about what the author meant than if I see '2 MB'. I agree that we should highlight the uncertainty. It is definitely worth mentioning ambiguity and uncertainty. I merely want to avoid using phrases that attribute ambiguity and uncertainty to the SI standard itself. This can be achieved easily by a slight change in wording. Bobblewik  (talk) 13:45, 10 July 2005 (UTC)

What I'm saying is that language evolves and changes due to common usage. Defined concepts and quantities don't. Just because everyone uses "power" and "energy" interchangeably doesn't mean that power is no longer the rate of change of energy; the words are simply being used incorrectly.
"If someone says the speed of light is 3×10^8 m/s, they know they are making an approximation."
Not necessarily.
"they know what they mean by "M", and it's not 1,000,000."
Not necessarily. - Omegatron 15:30, July 10, 2005 (UTC)
Concepts remain unchanged, but words used for them often gain new meanings. Anyone trying to understand a physics paper without knowing what physicists mean by "power" and "energy" will be confused. Anyone trying to understand the specification of a RAM chip without knowing what computer people mean by "kilo" and "mega" in different contexts will be similarly confused. Anyway, this is getting away from the original question, which is:
Should people object to "MB" being changed to "MiB", where the latter is factually correct?
The only people who have objected so far have done so either because "MiB" is too obscure (fixed by a simple wikilink), or because Microsoft/Sony/whoever did not use the IEC terminology in their own documents, which I don't think is of any relevance to us. – Smyth\talk 17:37, 10 July 2005 (UTC)
Agreed, those are the only objections I have noted thus far as well, with the exception of people who may not understand the proper application of the binary units. I think most objections initially start because people are unfamiliar with the term and prefer to stick with what they are personally familiar with. --Thax 03:33, 11 July 2005 (UTC)
FWIW, I think the arguments for MB at the start of this discussion have more merit. It's the unit used by the manufacturers and therefore the most used term and the most understood by laymen. Two arguments specifically mentioned in the WP naming conventions. - Mgm|(talk) 08:49, July 11, 2005 (UTC)
"Nine point nine times out of ten, "kilobyte" will be used with the binary meaning"
"The decimal meaning is so rare that its use should always be explicitly declared."
I disagree. Talk:Binary prefix#Various_references has clearly established that proper use of SI prefixes to express decimal capacities is more prevalent than binary. There are only 3 main offenders: RAM capacities, CD capacities, capacities reported by OS (Windows in particular). The last one creating most problems. Delicates 13:31, 11 July 2005 (UTC)

For "OS" read "virtually every program on any operating system that describes byte sizes". If we're going to talk about what the average user will be familiar with, that's a fairly important fact. And I don't know why you pick out Windows; on Linux, ls, dd and du all use binary multiples. As is conventional, iptables uses decimal multiples when counting packets or bytes, but higher-level networking programs like web browsers or p2p clients will use binary multiples on all platforms. – Smyth\talk 14:12, 11 July 2005 (UTC)

I thought I heard the Linux kernel switched to IEC prefixes... - Omegatron 14:55, July 11, 2005 (UTC)
I don't know where the kernel would have need to use any prefixes at all. Anywhere it presents a number, the number is probably unabbreviated. – Smyth\talk 14:04, 12 July 2005 (UTC)
Hmm.. I don't know. This is all I have to go on:
"After a much heated discussion in December 2001 on linux‐kernel mailing list, the binary prefixes have been accepted by key Linux developers, and are now extensively gaining ground across Open Source UNIX applications." [1]
Developer discussion here
I love how computer people say that kibi- is "ugly" or "sounds funny" after coming up with units like "byte" and "nibble". - Omegatron 14:32, July 12, 2005 (UTC)
For what it's worth, the Linux kernel is not consistent at all: in the boot log alone you will find "Kbytes", "kB", and "k". There's bound to be Kib somewhere as well. Rl 14:35, 12 July 2005 (UTC)
This reminds me of the AD/CE dispute last month. Like it or not, similar to AD, Kb is the most widely used and most widely understood term. Similar to CE, Kib is a well-intentioned attempt that has failed to displace the common meme. The average person will know what a kilobyte is by now, but will look strangely at a kibibyte. Radiant_>|< 11:26, July 12, 2005 (UTC)
But "KB" doesn't endorse a Christian POV.  :-) - Omegatron 13:19, July 12, 2005 (UTC)

Vote

Note: No end-date was designated for the vote, but as of July 23, 2005, the votes were:

  • 20 support: "The MoS should encourage the use of the IEC prefixes in all binary-multiple contexts"
  • 1 support: "The MoS should encourage the use of the IEC prefixes only in highly technical contexts"
  • 6 support: "The MoS should discourage the use of IEC prefixes anywhere"
  • 0 support: "Don't mention"
  • 2 support: "No more votes"

The MoS should encourage the use of the IEC prefixes in all binary-multiple contexts

Proposed wording: as it is currently worded (July 9, 2005). (The current wording allows some flexibility "The use of the new binary prefix standards in the Wikipedia is not required, but is recommended" ... "but you can change 512 MB RAM to 512 MiB RAM where it is important to do so.")

  1. Omegatron 16:00, July 12, 2005 (UTC) - This article sums up my opinion perfectly: A plea for sanity.
    On the Wikipedia, accuracy is more important than "common usage" (which isn't really common usage, anyway, outside of computer science classrooms.)
    Even in instances where the units are referring to an approximation, I think the appropriate unit should be used. ("...memory chips in the hundreds of mebibytes...", "...archives use several TB of disk space...")
  2. Pmsyyz 15:42, 12 July 2005 (UTC)
  3. Delicates 16:00, 12 July 2005 (UTC)
  4. Urhixidur 16:09, 2005 July 12 (UTC)
  5. Dpbsmith (talk) 16:28, 12 July 2005 (UTC). The issue, as always, should be serving the reader. Because the "traditional" nomenclature is ambiguous, using only the "traditional" nomenclature never serves the reader well. Leave it up to the discretion of the writer as to whether the terms should be briefly explained on their first appearance within an article, and whether there is any need to provide equivalents in decimal-based units. And, Emerson, foolish consistency, etc. As with British versus American usage there is virtue to consistency within any single article but no compelling need for consistency throughout Wikipedia. It is safe to assume that our target audience includes readers who are both familiar and unfamiliar with the IEC prefixes, and also that our target audience should not have any difficulty understanding the IEC prefixes if they are explained or linked on first occurrence. Dpbsmith (talk) 16:28, 12 July 2005 (UTC)
  6. Thue | talk 16:42, 12 July 2005 (UTC)
  7. Thax 16:44, 12 July 2005 (UTC)
  8. Dragons flight 16:56, July 12, 2005 (UTC) Changing vote. I had misunderstood what was meant by "all appropriate". The current wording seems to be a reasonable compromise, though I would still like something to the effect of "if you don't know or the reader couldn't possibly care, stick with MB, etc."
    • "Appropriate" was a bad wording, sorry. I have changed it to "binary-multiple", since I trust that's how most people understood it, and it is what the current wording says. – Smyth\talk 17:32, 12 July 2005 (UTC)
  9. Grahn 17:16, 12 July 2005 (UTC)
  10. Cburnett 19:29, July 12, 2005 (UTC)
  11. Seems fine as long as the wording is not too strong. Gdr 22:44, 12 July 2005 (UTC)
  12. uberpenguin 01:33, July 13, 2005 (UTC) My opinions are already obvious here, but I'm adding my name for posterity. Just acknowledging that there exists an ambiguity problem isn't sufficient; we need to do something about it rather than just sitting back and resigning ourselves to ignoring the problem until it comes up again. I think the wording should also at least mention the usage of the older term "Kiloword," "Megaword," etc, when referencing (mostly older) computers that did not use an octet byte as the base memory unit. Here again, the IEC binary prefix should be recommended but not required ONLY if it is appropriate (since there are plenty of circumstances when the SI interpretation of "Megaword" -- 10^6 words -- is the correct one).
  13. Lachatdelarue (talk) 14:14, 13 July 2005 (UTC)
  14. Weejee 03:45, 14 July 2005 (UTC)
  15. The binary prefixes have for the past 3-5 years shown up in computer science courses world-wide and can be expected to find their way into common industry use as more new students graduate. Wikipedia can easily adopt new and sensible terminology, because an explanation of what it means is just a click away. It is ridiculous, on the other hand, to confront readers with a convention that says that 1 kbit = 1024 bit whereas 1 kbit/s = 1000 bit/s. Markus Kuhn 15:52, 14 July 2005 (UTC)
  16. D. F. Schmidt (talk) 18:32, 15 July 2005 (UTC) I feel that it shouldn't matter whether something is rendered in MB (10^6) units or (2^20) units. The difference is slightly less than 5%. (For gigabytes, the difference is 7.37%.) This should be inconsequential to anyone reading this encyclopedia. For one, if someone studies enough, they'll figure out this discrepancy sooner or later, and when they think about it long enough, they'll determine that this whole thing is moot. Why bother caring about it? The SI fan in me remembers the calorie/kilocalorie problem, but this also is inconsequential, because the term used on food product packaging is the only one used on food product packaging. Thus, it doesn't matter if a computer product says '1 GB,' and the actual computer renders that as 10^9 bytes or as 10^9 / 2^30 ≈ 0.931 binary gigabytes. There are at least two reasons for this: the customer cannot do anything about it, and the manufacturer is not willing to do anything about it. But in the interests of science and precision, it should be a policy that base-10 prefixes and base-2 prefixes should be used where applicable.
  17. Bobblewik  (talk) 18:51, 15 July 2005 (UTC)
  18. Dwheeler  Where a specific value for a base-2 multiple is given, you should always use the binary prefixes: MiB, GiB, etc. Where a base-10 multiple is used, or no precision is intended, use the base-10 prefixes, e.g., "many megabytes". Many Americans don't routinely use metric, so the SI prefixes may be less familiar to them, but everyone else "knows" that a kilo is a thousand, a "Mega" is 1.000.000, and so on. The "1.44Mbyte floppy" mixes the binary and decimal units, and transmissions in bytes are often measured where megabytes=1.000.000bytes, so even with bytes the prefixes' meanings are starting to go back to being exclusively decimal prefixes as they were originally intended. As computer hardware becomes more capable the inaccuracies are getting larger. Finally we have an accurate way of expressing these values; we should use them. I'm already seeing MiB in many other places, and there's no reason for Wikipedia to lag behind.
  19. Christoph Päper 04:30, 17 July 2005 (UTC) Everything else is just stupid (as wiki-voting is). Most people will just ignore the i for now.
  20. WLD 22:32, 17 July 2005 (UTC)
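
As a quick check on the percentages quoted in vote 16 above, here is a minimal Python sketch. The function name `binary_excess_percent` and the prefix table are made up for illustration; the only inputs are the powers of 1000 and 1024 already under discussion.

```python
# Hypothetical helper: how much larger is the binary reading of a prefix
# than the decimal reading, in percent?
PREFIX_POWERS = {"kilo": 1, "mega": 2, "giga": 3, "tera": 4}

def binary_excess_percent(power: int) -> float:
    """Percent by which 1024**power exceeds 1000**power."""
    return (1024 ** power / 1000 ** power - 1) * 100

for name, power in PREFIX_POWERS.items():
    # e.g. mega comes out a little under 5%, giga a little over 7%
    print(f"{name}: {binary_excess_percent(power):.2f}%")
```

The gap grows with each prefix step, which is the point made in vote 18 about inaccuracies getting larger as hardware becomes more capable.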

The MoS should encourage the use of the IEC prefixes only in highly technical contexts

Proposed wording: first two paragraphs as at present, followed by:

However, the IEC prefixes are still relatively obscure and should not be used in general-interest articles. It is not necessary to expand every use of the SI prefixes if they are only being used approximately, but if the exact value is at all significant then it should be identified explicitly, like this:

  • "... 512 MB (536,870,912 bytes) of RAM, and a 40 GB (40,000,000,000 byte) hard drive."
The IEC prefixes should only be used in highly technical articles where binary multiples are used exclusively, such as Random access memory. On their first appearance, they should be adorned with links to their corresponding articles, like this:
  • "A 512 MiB memory module"
  1. Smyth\talk 13:58, 12 July 2005 (UTC)
  • Dragons flight 16:07, July 12, 2005 (UTC) As per comments above, the use of unfamiliar (even if technically correct) terminology is a distraction in many contexts where the distinction is utterly irrelevant to the reader's understanding. I would prefer, however, that "highly technical" be changed to "technical" and "should only be used" to "are recommended to be used"; more flexibility is better.

The MoS should discourage the use of IEC prefixes anywhere

Proposed wording:

However, the IEC prefixes are obscure and should not be used if possible. It is not necessary to expand every use of the SI prefixes if they are only being used approximately, but if the exact value is especially significant then it should be identified explicitly, like this:

  • "... 40 GB (40,000,000,000 byte) hard drive."
  1. Support The prefixes are useless, and there's no line between general interest and technical articles. --Dtcdthingy 17:25, 12 July 2005 (UTC)
  2. Support Wikipedia is an encyclopedia, not an instrument for special interest groups (like IEC) to try to push the way they would like the world to work. We should reflect in the encyclopedia what the world is like, not what we think it should be. The reality is that kilobyte means 1024 bytes most of the time it's used. Many people who use computers (including much of the IT industry) have never heard of a kibibyte and don't use the term. We shouldn't be social engineering.
    Ben Arnold 21:41, 12 July 2005 (UTC)
  3. Support As taught in most universities' comp sci departments and as understood by programmers, a kilobyte is 1024 bytes, as the user above pointed out. --kudz75 00:20, 13 July 2005 (UTC)
  4. Support This seems to be a meaningless topical discussion like arguing if Pluto is a planet. The IEC is trying to make a mountain out of a mole hill here, and changing terminology is something that should have been done back in the 1950's, not 1990's. Marketing gurus are the real idiots who have mistaken MB to mean 10^6 anyway, as most real software and computer engineers have no problem with MB == 2^20 (1,048,576) and GB == 2^30. This mainly is because a marketing idiot suddenly discovers that a 40 GB drive (40 x 2^30 = 42,949,672,960) can be marketed as a 43 GB drive, and not 40 GB, to try and make it seem their product is somehow better than a competitor's. From my experience, even hard-core developers who really do development work don't really care about semantic discussions anyway, except for a few purists. --Robert Horning 11:01, 13 July 2005 (UTC)
  5. The standard's not established enough yet. I had never heard of these things before I came to Wikipedia. Neonumbers 11:21, 13 July 2005 (UTC)
  6. ...and I had never heard of these things before it was raised on the Pump, and I've been downloading countless gigs of who-knows-what since 1996. Come back in 2008 when it's an accepted term, or, rather, at which point it's stagnated. Oh, Support, as if it wasn't obvious. :) GarrettTalk 03:22, 16 July 2005 (UTC)

The MoS should not mention this issue at all

No more stupid votes

  1. Voting is not what consensus is about. I'm sick to death of votes being started left and right. m:Polls are evil. -- Cyrius| 23:06, 12 July 2005 (UTC)
  2. Hear hear. — David Remahl 10:36, 13 July 2005 (UTC)

The Wikipedia should only represent common usage

This is ridiculous. There are a few extremely important points that are being ignored here. First, and most importantly, the Manual of Style should reflect common usage on Wikipedia, and not prescribe a usage which is not the common usage. So no matter if 3 or 5 people vote here that the MoS should "recommend" the IEC prefixes, if that usage is not the common usage on Wikipedia, then it shouldn't be in the MoS. The reality is that the IEC prefixes are extremely obscure, particularly to the lay reader. Second, "oh, we'll just put in a link" is not really an adequate response to that complaint. It's not a valid argument for the same reason that many articles include measurements in feet and inches. Third, people are used to kilobytes being 1024 bytes and megabytes being 1024 kilobytes, and even though there are new prefixes that define that explicitly, those prefixes do not enjoy common usage. It doesn't matter if they're official (whatever that means--there is no regulatory authority over the English language). The only thing that matters is common English usage—and with the exception of hard disk manufacturers and a few others, a megabyte almost always means exactly 1,048,576 bytes [2]. Usage on Wikipedia should reflect the common usage, and the MoS should reflect usage on Wikipedia. Nohat 23:24, 12 July 2005 (UTC)

If a styleguide or manual of style was just to reflect common usage, it would be pretty much unnecessary. It reflects best usage. It's about shoulds, not ares, mays or cans, but neither musts. Even (or especially) anarchy needs some ground rules. Christoph Päper 04:37, 17 July 2005 (UTC)
But that's not true. We've been through this before on other talk pages. 1024-bit kilobits are not an overwhelming majority usage. Depends on what field you're talking about.

"The reality is that the IEC prefixes are extremely obscure, particularly to the lay reader."

kilobyte = 1024 bytes is obscure to the lay reader.
Not to anyone who's bothered to look it up in the dictionary, which all give 1024 bytes as the primary if not the only meaning. [3] [4], also the Oxford American Dictionary. Nohat 06:05, 13 July 2005 (UTC)
"Oh, we'll just put in a link" is not really an adequate response to that complain[t].
And believe it or not, this usage is not consistent at all. I searched each of the bolded words, and here are their first hits:
  • "Technically, therefore, a kilobyte is 1,024 bytes, but it is often used loosely as a synonym for 1,000 bytes."[5]
  • "In data communications, a kilobit is a thousand (10^3) bits." [6]
  • "In the U.S., Kbps stands for kilobits per second (thousands of bits per second)" [7]
  • "In data communications, a megabit is a million binary pulses, or 1,000,000 (that is, 10^6) pulses (or "bits")." [8]
  • "When used to describe data storage, 1,048,576 (2 to the 20th power) bytes. Megabyte is frequently abbreviated as M or MB. (2) When used to describe data transfer rates, as in MBps, it refers to one million bytes. " [9]
  • "Fast Ethernet to 1000 Mbps, or 1 gigabit per second (Gbps)" [10]
Just because you've never heard of something doesn't mean that it doesn't exist. - Omegatron 12:45, July 13, 2005 (UTC)

Part of the reason for those distinctions is that once you throw in seconds, it is no longer a counted quantity but rather a measured quantity. The binary progression disappears when you add a time factor. Gene Nygaard 12:55, 13 July 2005 (UTC)

Yes. And the binary progression is not even there in the first place for things like disk media - CDs, DVDs, hard drives are all based on cylinders and frames and sectors, not powers of 2.
Actually, memory is really the only thing that is implicitly binary. Bus speeds, clock cycles per calculation, storage sizes, clock rates, communications rates, calculations per second, and everything else are measured, decimal quantities. - Omegatron 13:38, July 13, 2005 (UTC)
And it's not just hard drive manufacturers. Why does everyone always say that? I guess they felt ripped off the first time they bought a hard drive and vowed to never forgive the conspirators.  :-)
More Wikipedians probably say "aluminum" than "aluminium" (Americans), but we stick to the recognized standard when there is one. - Omegatron 23:41, July 12, 2005 (UTC)
No, no, no. The vast majority of times that most people interact with measurements of bytes is every day while they're using their computer looking at directory listings or properties dialogs containing file sizes or measures of free space. We're talking about at least 99% of the times that people deal with any kind of measurement in bytes, if not dramatically more. Perhaps in certain subfields people use the powers of 10 meanings, but the number of people for whom the power of 10 meaning is the most salient is vanishingly small. The rest of the world, ordinary people who are just using their computers, deal with the capacity of hard disks or CD-ROMs or DVDs or flash devices or whatever many orders of magnitude less frequently than they deal with the actual size of files or free space, which is essentially invariably reported by the operating system in powers of 2.
No, it doesn't. One of the reasons this is an issue in the first place is that computer software has never consistently adhered to any particular standard in reporting RAM and disk usage. For example, in the current release of Windows XP the formatting utility refers to the capacity of a floppy as "1.44MB", which is neither binary nor decimal (it is the "hybrid" megabyte = 1000 * 1024), and then the same utility, after formatting, refers to the capacity in binary megabytes. It would take some careful research to do a full historical overview of how various versions of various operating systems have handled this, but there is no well-defined, consistent, de facto usage. Dpbsmith (talk) 14:35, 13 July 2005 (UTC)
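
To make the "hybrid" megabyte concrete, here is a small Python sketch. It assumes the floppy capacity is 1440 * 1024 bytes, which is what "1.44MB" works out to under the 1000 * 1024 definition given above; the names `FLOPPY_BYTES` and `in_units` are hypothetical.

```python
# Assumption: "1.44MB" means 1.44 hybrid megabytes of 1000 * 1024 bytes each,
# i.e. 1440 * 1024 = 1,474,560 bytes.
FLOPPY_BYTES = 1440 * 1024

def in_units(n_bytes: int, unit_bytes: int) -> float:
    """Express a byte count in units of unit_bytes bytes."""
    return n_bytes / unit_bytes

decimal_mb = in_units(FLOPPY_BYTES, 1000 * 1000)  # decimal megabytes
binary_mib = in_units(FLOPPY_BYTES, 1024 * 1024)  # binary megabytes (MiB)
hybrid_mb = in_units(FLOPPY_BYTES, 1000 * 1024)   # the "hybrid" megabyte

print(f"decimal: {decimal_mb:.5f}  binary: {binary_mib:.5f}  hybrid: {hybrid_mb:.2f}")
```

Only the hybrid unit reproduces the marketed figure of 1.44; the decimal and binary readings land on either side of it.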
Two, the "recognized standard" argument will never fly as a justification for a policy. That's just not the way Wikipedia works. There are countless examples on Wikipedia going back years where supposed "standards" are not used because they're obscure or don't represent common usage. Just look at the policy for, say, the titles of articles about biological organisms or foreign cities or the policy for the use of various national spellings. Indeed, aluminium remains at that title NOT because the official IUPAC name is "aluminium", but because of the "use national spelling used by first major contributor to article" policy.
I didn't say it was just hard disk manufacturers ("hard disk manufacturers and a few others"), but they are in fact the only significant user of those meanings of megabyte and gigabyte. But as I explained above, the vast majority of usage—by an almost incomprehensible margin—is the size of files as reported by operating systems, which is nearly invariably using the powers of 2 meanings of kilobyte, megabyte, etc. Nohat 05:56, 13 July 2005 (UTC)
"The only thing that matters is common English usage"
Not true. Accuracy is more important than common usage. The old usage of "kilobyte" to mean both 1000 or 1024 bytes, depending on context, is hopelessly ambiguous, inaccurate, and really frustrating to both computer newcomers and computer-related developers (except comp sci majors, apparently, who are off in their own little world). - Omegatron 12:45, July 13, 2005 (UTC)

No, service to the reader is what matters. In this case, a) common usage is not all that well-defined or universal. When a software job mentions a salary of "80K" it does not mean $81,920. b) Common usage is, in fact, confusing. The commonest current practice seems to be to disambiguate "MB" or "GB" by giving the actual number of bytes, e.g. "Used: 9.95 GB on disk (10,683,035,648 bytes)." This would not be necessary if MB and GB were widely understood to have binary meanings. c) Practice b is very awkward. It is convenient to have short abbreviations. d) Regardless of whether the IEC prefixes are in the process of adoption, they are very easily understood even from context. It is not appropriate to push an agenda of using something deemed technically correct or important, but it is perfectly reasonable to use the IEC abbreviations selectively in situations where the alternatives are awkward or long or clumsy or confusing. I would oppose using GiB if GB in fact had an unambiguous and universally-understood meaning. Since it does not, we need to do something besides simply using GB, and GiB seems very reasonable to me. It's short, sweet, easily understood, and has the backing of at least one standards organization. Dpbsmith (talk) 14:35, 13 July 2005 (UTC)

See Talk:Binary_prefix#Organization_recommendations for several organizations that have adopted these units. I'm sure there are more. - Omegatron 14:57, July 13, 2005 (UTC)
I'd imagine that most contexts in Wikipedia where "kilobyte", "megabyte" etc are used fall into two groups:
  • Inexact estimates where the binary/decimal distinction is not significant. In this case, they should be left as "KB", "MB", etc, and the current wording supports that.
  • Exact measurements, in technical specifications and so forth, where the distinction is significant. In this case, they should be changed to whichever of the SI or IEC prefixes makes the exposition easier, and a full expansion given if there is any chance of confusion. Such technical discussion would probably not be read by "ordinary people who are just using their computers", and even if they were, "MiB" looks so similar to "MB" that they would probably skip over it without a thought. – Smyth\talk 10:02, 13 July 2005 (UTC)

A prime example of mixed-usage nonsense

See [11] for a perfect example of the confusion this nonsense causes. Even if you replace the binary prefixed megabytes with mebibytes, the equations still aren't correct, since 1 MiB is not equal to 1,024 kB. 1 MiB = 1,048,576 bytes = 1,048.576 kB.

Ugh. I'm going to attempt a fix. - Omegatron 18:54, July 18, 2005 (UTC)
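
The conversions in the comment above can be checked with a short Python sketch. The unit table and the `convert` function are illustrative names only, and the table assumes kB and MB carry their decimal (SI) meanings while KiB and MiB carry the IEC binary ones.

```python
# Bytes per unit, assuming SI meanings for kB/MB and IEC meanings for KiB/MiB.
UNIT_BYTES = {
    "B": 1,
    "kB": 1000,
    "MB": 1000 ** 2,
    "KiB": 1024,
    "MiB": 1024 ** 2,
}

def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a quantity between any two units in UNIT_BYTES."""
    return value * UNIT_BYTES[from_unit] / UNIT_BYTES[to_unit]

print(convert(1, "MiB", "B"))    # bytes in one mebibyte
print(convert(1, "MiB", "kB"))   # decimal kilobytes in one mebibyte
print(convert(1, "MiB", "KiB"))  # kibibytes in one mebibyte
```

Under these definitions 1 MiB is 1,024 KiB but 1,048.576 kB, so an equation that mixes the two prefix families only balances if every term uses the same one.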