Talk:Windows-1252

From Wikipedia, the free encyclopedia

Why can the codepage Windows-1252 not used on MacOS or Linux? --84.61.69.103 16:10, 8 June 2006 (UTC)

I'm not sure exactly what you mean by that; some software, even in non-Windows operating systems, might recognize this character encoding and properly render documents transmitted or stored using it; however, as its name implies, it's an encoding which was devised by Microsoft for use in Windows, and is not one of the standard, platform-neutral ones such as iso-8859-1 or utf-8. *Dan T.* 16:47, 8 June 2006 (UTC)
Most operating systems use some unicode encoding and/or a small selection of legacy encodings as an internal format. What is supported by the conversion libs and apps that communicate over open protocols is generally a much wider selection and the windows-125x encodings certainly get this level of support on all major platforms. Windows-1252 is also the unofficial default character set of the web (officially its ISO-8859-1 but since the C1 control codes are banned anyway...........) Plugwash 22:42, 8 June 2006 (UTC)
I'm not sure exactly what you mean by that, but I guess this; for example, in Linux you can not mount fat32 volumes with 'iocharset=cp1252', and too often 'locales' usually put things worse. At the end, it's almost impossible to interchange files between Windows and Linux without messing up the filenames, (at least for me, an average Linux user, outside USA).

Is Windows-1252 a proprietary codepage? --84.61.43.60 16:44, 31 August 2006 (UTC)

Well it was introduced by a vendor without any standards body approval but there is nothing preventing anyone who wants to from implementing support for it. As simple statements of facts about an encoding method i'm pretty sure raw codepage tables are not copyrightable and information on the code page is freely availible to anyone who wants it from microsoft themselves, unicode.org and countless other places. Plugwash 16:14, 1 September 2006 (UTC)

Contents

[edit] "AFAIK, only IE and other MS products do this"

The above was an edit reason for the addition of a {{fact}} tag. I don't have a cite but i tried it with IE and firefox on windows and firefox and konqueror on linux and they all did it. Plugwash 15:51, 6 September 2006 (UTC)

Whoops, got my facts screwed up. IE and MS products do this for ISO-8859-15. The others do it for ISO-8859-1 only. Removing request for citation. --ChrisRuvolo (t) 19:03, 6 September 2006 (UTC)

[edit] 0x80-0x9F was not used in ISO Latin-1

The article says, "The encoding is a superset of ISO 8859-1, but differs by using displayable characters rather than control characters in the 0x80 to 0x9F range." But the ISO 8859-1 page has it that 0x80-0x9F is left unused. Which is right? --Apantomimehorse 02:44, 11 September 2006 (UTC)

i've made a minor correction to this article now. Plugwash 00:25, 27 September 2006 (UTC)

[edit] Table

The table in this article is really poor; for instance the difference between zero and O is almost invisible and the there is no difference between lower case l and upper case i. Worse, the difference between the various types of quotes is completely lost. I think we should follow the nice example at Code page 437 or at least use a font that can convey the differences of the characters. AxelBoldt 21:00, 6 October 2006 (UTC)

The FF hex character code (lower right corner of table) is AFAIK only defined in Windows ANSI, not in Latin-1. - Alf—The preceding unsigned comment was added by 81.191.161.87 (talk) 22:39, 18 February 2007 (UTC).

[edit] Alt key input

In fact, this method enters characters from "ANSI" and "OEM" codepages associated with current keyboard language/layout, not just 1251 and 437. Thus switching keyboard between say Russian and Norwegian one can enter different sets of characters.

[edit] Subset

Wow! I thought about making my change a minor one, I did not think anyone could seriously object to my modifications. I am not an experienced wikipedia contributor, so I wanted to play it save ... well.

I am not convinced by your arguments. It doesn't matter whether people use the C1 control codes of ISO 8859-1 or not. The octets in question are defined in both encodings, with different meanings. The term "subset" is, thus, wrong. One can argue that the average user benefits from the extra glyphs he can produce by using Windows-1252, more than from the contol codes of ISO 8859-1. However, this is a technical matter we are talking about; it helps to be precise. More so because readability was not disturbed and there where no information lost due to my editing. Traxer 17:04, 19 February 2007 (UTC)

One could say that the printable characters (the ones with visible glyphs) of windows-1252 are a superset of those in iso-8859-1, but not the complete set of characters (printable or control) in both encodings. The encodings themselves aren't "sets" (or "subsets" or "supersets"), because, strictly speaking, a mathematical set has no ordinal numbers assigned to its components (there is just a cardinal number of the set's membership), while a character encoding consists of a series of ordinal pairings between numbers and characters. *Dan T.* 19:50, 19 February 2007 (UTC)
The thing is, for the purposes of this page, abstract mathematical set theory and the Zermelo-Frankel axiom (or whatever) is all completely irrelevant. What matters is is what people commonly mean when they use "superset" in the context of computer software and character sets. However, if you want to assuage your mathematical conscience, you can reflect that the official ISO/IEC 8859-1 specification technically doesn't define the "C1" control area -- see the green areas in the table on that article page. AnonMoos 04:54, 20 February 2007 (UTC)
If ISO/IEC 8859-1 specification does not define the code values 0x80 to 0x9F, then the phrase "using displayable characters rather than control characters in the 0x80 to 0x9F range" is misleading.Traxer 17:08, 21 February 2007 (UTC)
Note the distinction between the standard "ISO/IEC 8859-1" IANA charset "ISO-8859-1", the former does not define any control codes, the latter does. Plugwash 20:34, 21 February 2007 (UTC)
The text compares Windows-1252 and ISO/IEC 8859-1, not Windows-1252 and IANA's ISO-8859-1. AnonMoos is right on that point, 0x80 to 0x9F are not defined. Traxer 10:27, 22 February 2007 (UTC)