ISO 8859

From Wikipedia, the free encyclopedia.

ISO 8859, more formally ISO/IEC 8859, is a joint ISO and IEC standard for 8-bit character encodings for use by computers. The standard is divided into numbered, separately published parts, such as ISO/IEC 8859-1, ISO/IEC 8859-2, etc., each of which may be informally referred to as a standard in and of itself. There are currently 15 published parts, numbered 1 to 16; part 12 was never published.

Introduction

While the bit patterns of the 95 printable ASCII characters are sufficient to exchange information in modern English, most other languages that use the Latin alphabet need additional symbols not covered by ASCII, such as ß (German) and å (Swedish and other Nordic languages). ISO 8859 sought to remedy this problem by using the eighth bit of an 8-bit byte to allow positions for another 128 characters. (This bit was previously used for data transmission protocol information, or was left unused.) However, more characters were needed than could fit in a single 8-bit character encoding, so several mappings were developed, including ten for the Latin script alone.
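The role of the eighth bit can be illustrated with a short Python sketch. It assumes Python's built-in `iso8859_1` codec as a stand-in for ISO 8859-1; the helper name `uses_eighth_bit` and the sample text are made up for illustration.

```python
# Sketch: in an 8-bit encoding, ASCII occupies byte values 0-127 and the
# additional characters occupy 128-255, i.e. bytes whose eighth bit is set.

def uses_eighth_bit(byte: int) -> bool:
    """Return True if the high (eighth) bit of the byte value is set."""
    return (byte & 0x80) != 0

text = "Straße på"                     # hypothetical sample text
encoded = text.encode("iso8859_1")     # one byte per character

# Pair each character with its byte and keep those outside the ASCII range.
high_bit_chars = [ch for ch, b in zip(text, encoded) if uses_eighth_bit(b)]
print(high_bit_chars)
```

Only ß and å are reported, because every other character in the sample is plain ASCII and therefore encoded with the eighth bit clear.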

Although ISO 8859-n and ISO-8859-n are terms often used interchangeably, the ISO 8859 standard is not the same as the well-known ISO-8859-n character encodings approved by the IANA for use on the Internet. Besides the extra hyphen in the IANA-approved names, the encodings differ in that each part of the ISO standard assigns, at most, 191 characters to the byte ranges 32 to 126 and 160 to 255, whereas the corresponding IANA-approved character encoding merges these mappings with the C0 control set (control characters mapped to bytes 0 to 31), the DEL character (byte 127), and the C1 control set (control characters mapped to bytes 128 to 159), resulting in a full 8-bit character map with most, if not all, bytes assigned.
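The figure of 191 follows directly from the two byte ranges. A minimal Python sketch of the arithmetic is given here; it also assumes that Python's `iso8859_1` codec behaves like the IANA-style full 8-bit map, which is why every byte value decodes.

```python
# Graphic positions available to a part of ISO 8859, per the ranges above.
graphic = list(range(32, 127)) + list(range(160, 256))

# Every other byte value is left to control characters in the IANA charsets.
control = [b for b in range(256) if b not in graphic]

print(len(graphic), len(control))

# Assumption: Python's codec follows the merged (IANA-style) mapping,
# so all 256 byte values decode without error.
decoded = bytes(range(256)).decode("iso8859_1")
print(len(decoded))
```

The two ranges contribute 95 and 96 positions respectively, which leaves 65 byte values for control characters.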

Characters

The ISO 8859 standard is designed for reliable information exchange, not typography; the standard omits symbols needed for high-quality typography, such as optional ligatures, curly quotation marks, dashes, etc. As a result, high-quality typesetting systems often use proprietary or idiosyncratic extensions on top of the ASCII and ISO 8859 standards, or use Unicode instead.

As a rule of thumb, if a character or symbol was not already part of a widely used data-processing character set and was also not usually provided on typewriter keyboards for a national language, it didn't get in. Hence the directional double quotation marks « and » used for some European languages were included, but not the directional double quotation marks “ and ” used for English and some other languages. French didn't get its œ and Œ ligatures because they could be typed as 'oe'. Ÿ, needed for all-caps text, was left out as well. These characters were, however, included later with ISO 8859-15, which also introduced the new Euro character €. Likewise Dutch did not get the 'ĳ' and 'Ĳ' letters, because Dutch speakers had gotten used to typing these as two letters instead. Romanian did not initially get its 'Ș/ș' and 'Ț/ț' letters, because these letters were initially unified with 'Ş/ş' and 'Ţ/ţ' by the Unicode Consortium, considering the shapes with comma beneath to be glyph variants of the shapes with cedilla. However, the letters with explicit comma below were later added to the Unicode standard and are also in ISO 8859-16.
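These gaps can be checked mechanically. The sketch below assumes Python's `iso8859_1` and `iso8859_15` codecs match the respective parts of the standard; the helper `encodable` is a made-up name for illustration.

```python
def encodable(ch: str, codec: str) -> bool:
    """Return True if the character has a position in the given encoding."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

late_additions = ["€", "Œ", "œ", "Ÿ"]

# None of these fit in part 1, but all of them fit in part 15.
in_part_1 = [encodable(ch, "iso8859_1") for ch in late_additions]
in_part_15 = [encodable(ch, "iso8859_15") for ch in late_additions]

# Guillemets were included in part 1; curly double quotes were not.
guillemet_ok = encodable("«", "iso8859_1")
curly_quote_ok = encodable("\u201c", "iso8859_1")

print(in_part_1, in_part_15, guillemet_ok, curly_quote_ok)
```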

Most of the ISO 8859 encodings provide diacritic marks required for various European languages. Others provide non-Latin alphabets: Greek, Cyrillic, Hebrew, Arabic and Thai. However, the standard makes no provision for the scripts of East Asian languages (CJK), as their ideographic writing systems require many thousands of code points. Although it uses Latin-based characters, Vietnamese does not fit into 96 positions either. The Japanese syllabic Kana scripts, on the other hand, would fit, but like several other writing systems of the world they are not encoded in the ISO 8859 system.

The Parts of ISO 8859

ISO 8859 is divided into the following parts:

ISO 8859-1	Latin-1 Western European	Perhaps the most widely used part of ISO 8859, covering most Western European languages: Basque, Catalan, Danish, Dutch (partial^[1]), English, Faeroese, Finnish (partial^[2]), French (partial^[2]), German, Icelandic, Irish, Italian, Norwegian, Portuguese, Rhaeto-Romanic, Scottish, Spanish, and Swedish, as well as Albanian and the African languages Afrikaans and Swahili. The missing Euro symbol and capital Ÿ are in the revised version ISO 8859-15. The corresponding IANA-approved character set ISO-8859-1 is the default encoding for legacy HTML documents and for documents transmitted via MIME messages, such as HTTP responses when the document's media type is "text" (as in "text/html").
ISO 8859-2	Latin-2 Central European	Supports those Central and Eastern European languages that use a Latin alphabet, including Polish, Croatian, Czech, Slovak, Slovenian, and Hungarian. The missing Euro symbol can be found in ISO 8859-16.
ISO 8859-3	Latin-3 South European	Turkish, Maltese, and Esperanto; largely superseded by ISO 8859-9 for Turkish and Unicode for Esperanto.
ISO 8859-4	Latin-4 North European	Estonian, Latvian, Lithuanian, Greenlandic, and Sami.
ISO 8859-5	Cyrillic	Covers mostly Slavic languages that use a Cyrillic alphabet, including Belarusian, Bulgarian, Macedonian, Russian, Serbian, and Ukrainian (partial^[3]).
ISO 8859-6	Arabic	Covers the most common Arabic language characters. Doesn't support other languages using the Arabic script.
ISO 8859-7	Greek	Covers the modern Greek language (monotonic orthography). Can also be used for Ancient Greek written without accents or in monotonic orthography, but lacks the diacritics for polytonic orthography. These were introduced with Unicode.
ISO 8859-8	Hebrew	Covers the modern Hebrew alphabet as used in Israel. In practice two different encodings exist, logical and visual.
ISO 8859-9	Latin-5 Turkish	Largely the same as ISO 8859-1, replacing the rarely used Icelandic letters with Turkish ones. It is also used for Kurdish.
ISO 8859-10	Latin-6 Nordic	A rearrangement of Latin-4, considered more useful for Nordic languages. Baltic languages more often use Latin-4.
ISO 8859-11	Thai	Contains most glyphs needed for the Thai language.
ISO 8859-12		Was supposed to be Latin-7 and cover Celtic, but this draft was rejected. Numbering continued with -13.
ISO 8859-13	Latin-7 Baltic Rim	Added some glyphs for Baltic languages which were missing from Latin-4 and Latin-6.
ISO 8859-14	Latin-8 Celtic	Mostly a rearrangement of the ISO 8859-12 draft. Covers Celtic languages such as Gaelic and the Breton language.
ISO 8859-15	Latin-9	A revision of 8859-1 that removes some little-used symbols, replacing them with the Euro symbol € and the letters Š, š, Ž, ž, Œ, œ, and Ÿ, which completes the coverage of French, Finnish and Estonian.
ISO 8859-16	Latin-10 South-Eastern European	Intended for Albanian, Croatian, Hungarian, Italian, Polish, Romanian and Slovenian, but also Finnish, French, German and Irish Gaelic (new orthography). The focus lies more on letters than symbols. The currency sign is replaced with the Euro symbol.

^[1]—only the Ĳ/ĳ (letter IJ) is missing, which can be represented as IJ.
^[2]—missing characters are in ISO 8859-15.
^[3]—missing Ґ/ґ characters were reintroduced into Ukrainian in 1991.

Each part of ISO 8859 is designed to support languages that often borrow from each other, so the characters needed by each language are usually accommodated by a single part. However, some characters and language combinations cannot be accommodated without transcription. Efforts were made to make conversions as smooth as possible. For example, German has all seven of its special characters at the same positions in all Latin variants (1-4, 9-10, 13-16), and in many positions the characters differ between the sets only in their diacritics. In particular, variants 1-4 were designed jointly, and have the property that every encoded character appears either at a given position or not at all.
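The claim about German can be verified with a short Python sketch, assuming Python's `iso8859_N` codecs reflect the corresponding parts of the standard. The variable names are illustrative only.

```python
# The Latin parts of ISO 8859 listed in the text above.
LATIN_PARTS = [1, 2, 3, 4, 9, 10, 13, 14, 15, 16]
GERMAN_SPECIALS = "ÄÖÜäöüß"

# Encode the same seven characters with every Latin part.
encodings = {n: GERMAN_SPECIALS.encode(f"iso8859_{n}") for n in LATIN_PARTS}

# If the positions agree, every part yields the same byte string.
same_everywhere = len(set(encodings.values())) == 1

print(same_everywhere, encodings[1].hex(" "))
```

All ten parts place the seven characters at C4, D6, DC, E4, F6, FC and DF, so German text survives being read with the wrong Latin part.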

Table

Comparison of the various parts of ISO 8859
Binary	Oct	Dec	Hex	1	2	3	4	5	6	7	8	9	10	11	13	14	15	16
10100000	240	160	A0	Non-breaking space (NBSP)
10100001	241	161	A1	¡	Ą	Ħ	Ą	Ё		‘		¡	Ą	ก	”	Ḃ	¡	Ą
10100010	242	162	A2	¢	˘	˘	ĸ	Ђ		’	¢	¢	Ē	ข	¢	ḃ	¢	ą
10100011	243	163	A3	£	Ł	£	Ŗ	Ѓ		£	£	£	Ģ	ฃ	£	£	£	Ł
10100100	244	164	A4	¤	¤	¤	¤	Є	¤	€	¤	¤	Ī	ค	¤	Ċ	€	€
10100101	245	165	A5	¥	Ľ		Ĩ	Ѕ		₯	¥	¥	Ĩ	ฅ	„	ċ	¥	„
10100110	246	166	A6	¦	Ś	Ĥ	Ļ	І		¦	¦	¦	Ķ	ฆ	¦	Ḋ	Š	Š
10100111	247	167	A7	§	§	§	§	Ї		§	§	§	§	ง	§	§	§	§
10101000	250	168	A8	¨	¨	¨	¨	Ј		¨	¨	¨	Ļ	จ	Ø	Ẁ	š	š
10101001	251	169	A9	©	Š	İ	Š	Љ		©	©	©	Đ	ฉ	©	©	©	©
10101010	252	170	AA	ª	Ş	Ş	Ē	Њ		ͺ	×	ª	Š	ช	Ŗ	Ẃ	ª	Ș
10101011	253	171	AB	«	Ť	Ğ	Ģ	Ћ		«	«	«	Ŧ	ซ	«	ḋ	«	«
10101100	254	172	AC	¬	Ź	Ĵ	Ŧ	Ќ	،	¬	¬	¬	Ž	ฌ	¬	Ỳ	¬	Ź
10101101	255	173	AD											ญ
10101110	256	174	AE	®	Ž		Ž	Ў			®	®	Ū	ฎ	®	®	®	ź
10101111	257	175	AF	¯	Ż	Ż	¯	Џ		―	¯	¯	Ŋ	ฏ	Æ	Ÿ	¯	Ż
10110000	260	176	B0	°	°	°	°	А		°	°	°	°	ฐ	°	Ḟ	°	°
10110001	261	177	B1	±	ą	ħ	ą	Б		±	±	±	ą	ฑ	±	ḟ	±	±
10110010	262	178	B2	²	˛	²	˛	В		²	²	²	ē	ฒ	²	Ġ	²	Č
10110011	263	179	B3	³	ł	³	ŗ	Г		³	³	³	ģ	ณ	³	ġ	³	ł
10110100	264	180	B4	´	´	´	´	Д		΄	´	´	ī	ด	“	Ṁ	Ž	Ž
10110101	265	181	B5	µ	ľ	µ	ĩ	Е		΅	µ	µ	ĩ	ต	µ	ṁ	µ	”
10110110	266	182	B6	¶	ś	ĥ	ļ	Ж		Ά	¶	¶	ķ	ถ	¶	¶	¶	¶
10110111	267	183	B7	·	ˇ	·	ˇ	З		·	·	·	·	ท	·	Ṗ	·	·
10111000	270	184	B8	¸	¸	¸	¸	И		Έ	¸	¸	ļ	ธ	ø	ẁ	ž	ž
10111001	271	185	B9	¹	š	ı	š	Й		Ή	¹	¹	đ	น	¹	ṗ	¹	č
10111010	272	186	BA	º	ş	ş	ē	К		Ί	÷	º	š	บ	ŗ	ẃ	º	ș
10111011	273	187	BB	»	ť	ğ	ģ	Л	؛	»	»	»	ŧ	ป	»	Ṡ	»	»
10111100	274	188	BC	¼	ź	ĵ	ŧ	М		Ό	¼	¼	ž	ผ	¼	ỳ	Œ	Œ
10111101	275	189	BD	½	˝	½	Ŋ	Н		½	½	½	―	ฝ	½	Ẅ	œ	œ
10111110	276	190	BE	¾	ž		ž	О		Ύ	¾	¾	ū	พ	¾	ẅ	Ÿ	Ÿ
10111111	277	191	BF	¿	ż	ż	ŋ	П	؟	Ώ		¿	ŋ	ฟ	æ	ṡ	¿	ż
11000000	300	192	C0	À	Ŕ	À	Ā	Р		ΐ		À	Ā	ภ	Ą	À	À	À
11000001	301	193	C1	Á	Á	Á	Á	С	ء	Α		Á	Á	ม	Į	Á	Á	Á
11000010	302	194	C2	Â	Â	Â	Â	Т	آ	Β		Â	Â	ย	Ā	Â	Â	Â
11000011	303	195	C3	Ã	Ă		Ã	У	أ	Γ		Ã	Ã	ร	Ć	Ã	Ã	Ă
11000100	304	196	C4	Ä	Ä	Ä	Ä	Ф	ؤ	Δ		Ä	Ä	ฤ	Ä	Ä	Ä	Ä
11000101	305	197	C5	Å	Ĺ	Ċ	Å	Х	إ	Ε		Å	Å	ล	Å	Å	Å	Ć
11000110	306	198	C6	Æ	Ć	Ĉ	Æ	Ц	ئ	Ζ		Æ	Æ	ฦ	Ę	Æ	Æ	Æ
11000111	307	199	C7	Ç	Ç	Ç	Į	Ч	ا	Η		Ç	Į	ว	Ē	Ç	Ç	Ç
11001000	310	200	C8	È	Č	È	Č	Ш	ب	Θ		È	Č	ศ	Č	È	È	È
11001001	311	201	C9	É	É	É	É	Щ	ة	Ι		É	É	ษ	É	É	É	É
11001010	312	202	CA	Ê	Ę	Ê	Ę	Ъ	ت	Κ		Ê	Ę	ส	Ź	Ê	Ê	Ê
11001011	313	203	CB	Ë	Ë	Ë	Ë	Ы	ث	Λ		Ë	Ë	ห	Ė	Ë	Ë	Ë
11001100	314	204	CC	Ì	Ě	Ì	Ė	Ь	ج	Μ		Ì	Ė	ฬ	Ģ	Ì	Ì	Ì
11001101	315	205	CD	Í	Í	Í	Í	Э	ح	Ν		Í	Í	อ	Ķ	Í	Í	Í
11001110	316	206	CE	Î	Î	Î	Î	Ю	خ	Ξ		Î	Î	ฮ	Ī	Î	Î	Î
11001111	317	207	CF	Ï	Ď	Ï	Ī	Я	د	Ο		Ï	Ï	ฯ	Ļ	Ï	Ï	Ï
11010000	320	208	D0	Ð	Đ		Đ	а	ذ	Π		Ğ	Ð	ะ	Š	Ŵ	Ð	Đ
11010001	321	209	D1	Ñ	Ń	Ñ	Ņ	б	ر	Ρ		Ñ	Ņ	ั	Ń	Ñ	Ñ	Ń
11010010	322	210	D2	Ò	Ň	Ò	Ō	в	ز			Ò	Ō	า	Ņ	Ò	Ò	Ò
11010011	323	211	D3	Ó	Ó	Ó	Ķ	г	س	Σ		Ó	Ó	ำ	Ó	Ó	Ó	Ó
11010100	324	212	D4	Ô	Ô	Ô	Ô	д	ش	Τ		Ô	Ô	ิ	Ō	Ô	Ô	Ô
11010101	325	213	D5	Õ	Ő	Ġ	Õ	е	ص	Υ		Õ	Õ	ี	Õ	Õ	Õ	Ő
11010110	326	214	D6	Ö	Ö	Ö	Ö	ж	ض	Φ		Ö	Ö	ึ	Ö	Ö	Ö	Ö
11010111	327	215	D7	×	×	×	×	з	ط	Χ		×	Ũ	ื	×	Ṫ	×	Ś
11011000	330	216	D8	Ø	Ř	Ĝ	Ø	и	ظ	Ψ		Ø	Ø	ุ	Ų	Ø	Ø	Ű
11011001	331	217	D9	Ù	Ů	Ù	Ų	й	ع	Ω		Ù	Ų	ู	Ł	Ù	Ù	Ù
11011010	332	218	DA	Ú	Ú	Ú	Ú	к	غ	Ϊ		Ú	Ú	ฺ	Ś	Ú	Ú	Ú
11011011	333	219	DB	Û	Ű	Û	Û	л		Ϋ		Û	Û		Ū	Û	Û	Û
11011100	334	220	DC	Ü	Ü	Ü	Ü	м		ά		Ü	Ü		Ü	Ü	Ü	Ü
11011101	335	221	DD	Ý	Ý	Ŭ	Ũ	н		έ		İ	Ý		Ż	Ý	Ý	Ę
11011110	336	222	DE	Þ	Ţ	Ŝ	Ū	о		ή		Ş	Þ		Ž	Ŷ	Þ	Ț
11011111	337	223	DF	ß	ß	ß	ß	п		ί	‗	ß	ß	฿	ß	ß	ß	ß
11100000	340	224	E0	à	ŕ	à	ā	р	ـ	ΰ	א	à	ā	เ	ą	à	à	à
11100001	341	225	E1	á	á	á	á	с	ف	α	ב	á	á	แ	į	á	á	á
11100010	342	226	E2	â	â	â	â	т	ق	β	ג	â	â	โ	ā	â	â	â
11100011	343	227	E3	ã	ă		ã	у	ك	γ	ד	ã	ã	ใ	ć	ã	ã	ă
11100100	344	228	E4	ä	ä	ä	ä	ф	ل	δ	ה	ä	ä	ไ	ä	ä	ä	ä
11100101	345	229	E5	å	ĺ	ċ	å	х	م	ε	ו	å	å	ๅ	å	å	å	ć
11100110	346	230	E6	æ	ć	ĉ	æ	ц	ن	ζ	ז	æ	æ	ๆ	ę	æ	æ	æ
11100111	347	231	E7	ç	ç	ç	į	ч	ه	η	ח	ç	į	็	ē	ç	ç	ç
11101000	350	232	E8	è	č	è	č	ш	و	θ	ט	è	č	่	č	è	è	è
11101001	351	233	E9	é	é	é	é	щ	ى	ι	י	é	é	้	é	é	é	é
11101010	352	234	EA	ê	ę	ê	ę	ъ	ي	κ	ך	ê	ę	๊	ź	ê	ê	ê
11101011	353	235	EB	ë	ë	ë	ë	ы	ً	λ	כ	ë	ë	๋	ė	ë	ë	ë
11101100	354	236	EC	ì	ě	ì	ė	ь	ٌ	μ	ל	ì	ė	์	ģ	ì	ì	ì
11101101	355	237	ED	í	í	í	í	э	ٍ	ν	ם	í	í	ํ	ķ	í	í	í
11101110	356	238	EE	î	î	î	î	ю	َ	ξ	מ	î	î	๎	ī	î	î	î
11101111	357	239	EF	ï	ď	ï	ī	я	ُ	ο	ן	ï	ï	๏	ļ	ï	ï	ï
11110000	360	240	F0	ð	đ		đ	№	ِ	π	נ	ğ	ð	๐	š	ŵ	ð	đ
11110001	361	241	F1	ñ	ń	ñ	ņ	ё	ّ	ρ	ס	ñ	ņ	๑	ń	ñ	ñ	ń
11110010	362	242	F2	ò	ň	ò	ō	ђ	ْ	ς	ע	ò	ō	๒	ņ	ò	ò	ò
11110011	363	243	F3	ó	ó	ó	ķ	ѓ		σ	ף	ó	ó	๓	ó	ó	ó	ó
11110100	364	244	F4	ô	ô	ô	ô	є		τ	פ	ô	ô	๔	ō	ô	ô	ô
11110101	365	245	F5	õ	ő	ġ	õ	ѕ		υ	ץ	õ	õ	๕	õ	õ	õ	ő
11110110	366	246	F6	ö	ö	ö	ö	і		φ	צ	ö	ö	๖	ö	ö	ö	ö
11110111	367	247	F7	÷	÷	÷	÷	ї		χ	ק	÷	ũ	๗	÷	ṫ	÷	ś
11111000	370	248	F8	ø	ř	ĝ	ø	ј		ψ	ר	ø	ø	๘	ų	ø	ø	ű
11111001	371	249	F9	ù	ů	ù	ų	љ		ω	ש	ù	ų	๙	ł	ù	ù	ù
11111010	372	250	FA	ú	ú	ú	ú	њ		ϊ	ת	ú	ú	๚	ś	ú	ú	ú
11111011	373	251	FB	û	ű	û	û	ћ		ϋ		û	û	๛	ū	û	û	û
11111100	374	252	FC	ü	ü	ü	ü	ќ		ό		ü	ü		ü	ü	ü	ü
11111101	375	253	FD	ý	ý	ŭ	ũ	§		ύ	LRM	ı	ý		ż	ý	ý	ę
11111110	376	254	FE	þ	ţ	ŝ	ū	ў		ώ	RLM	ş	þ		ž	ŷ	þ	ț
11111111	377	255	FF	ÿ	˙	˙	˙	џ				ÿ	ĸ		’	ÿ	ÿ	ÿ

Position 0xA0 is always the non-breaking space, and 0xAD is the soft hyphen in most parts; the soft hyphen is displayed only at line breaks. Other empty fields are either unassigned or hold characters that the system in use is unable to display.

The table includes the additions made in the ISO/IEC 8859-7:2003 and ISO/IEC 8859-8:1999 editions. LRM stands for left-to-right mark (U+200E) and RLM stands for right-to-left mark (U+200F).
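Individual cells of the table can be reproduced by decoding a single byte with different parts. The following Python sketch assumes that Python's `iso8859_N` codecs agree with the table; the variable names are illustrative only.

```python
# All published parts (part 12 does not exist).
PARTS = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16]

def decode_byte(value: int, part: int) -> str:
    """Decode one byte value using the given part of ISO 8859."""
    return bytes([value]).decode(f"iso8859_{part}")

# 0xA0 is the non-breaking space (U+00A0) in every part.
nbsp_everywhere = all(decode_byte(0xA0, n) == "\u00a0" for n in PARTS)

# 0xA4 depends on the part: currency sign, Cyrillic Є, or Euro sign.
a4 = {n: decode_byte(0xA4, n) for n in (1, 5, 15)}

# 0xAD is the soft hyphen in part 1 but a Thai letter in part 11.
ad_part_1 = decode_byte(0xAD, 1)
ad_part_11 = decode_byte(0xAD, 11)

print(nbsp_everywhere, a4, repr(ad_part_1), ad_part_11)
```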

Relationship to Unicode and the UCS

Since 1991, the Unicode Consortium has been working with ISO to develop the Unicode Standard and ISO/IEC 10646: the Universal Character Set (UCS) in tandem. This pair of standards was created to unify the ISO 8859 character repertoire, among others, by assigning each character, initially, to a 16-bit code value, with some code values left unassigned. Over time, their models adapted to map characters to abstract numeric code points rather than fixed bit-width values, so that more code points and encoding methods could be supported.

Unicode and ISO/IEC 10646 currently assign about 100,000 characters to a code space consisting of over a million code points, and they define several standard encodings that are capable of representing every available code point. The standard encodings of Unicode and the UCS use sequences of one to four 8-bit code values (UTF-8), sequences of one or two 16-bit code values (UTF-16), or one 32-bit code value (UTF-32 or UCS-4). There is also an older encoding that uses one 16-bit code value (UCS-2), capable of representing one-seventeenth of the available code points. Of these encoding forms, only UTF-8's byte sequences are in a fixed order; the others are subject to platform-dependent byte ordering issues that may be addressed via special codes or indicated via out-of-band means.
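The code unit counts described above can be observed with Python's built-in codecs. This is a minimal sketch; the four sample characters are arbitrary choices covering the different UTF-8 lengths.

```python
# Sample characters: U+0041, U+00E9, U+20AC, U+1F600.
samples = ["A", "é", "€", "😀"]

# Number of 8-bit, 16-bit and 32-bit code units needed per character.
utf8_units = [len(ch.encode("utf-8")) for ch in samples]
utf16_units = [len(ch.encode("utf-16-le")) // 2 for ch in samples]
utf32_units = [len(ch.encode("utf-32-le")) // 4 for ch in samples]

print(utf8_units, utf16_units, utf32_units)

# Byte order matters for the 16-bit and 32-bit forms, but not for UTF-8.
little_endian = "A".encode("utf-16-le")
big_endian = "A".encode("utf-16-be")
print(little_endian, big_endian)
```

The last sample lies outside the range reachable by a single 16-bit value, so UTF-16 needs two code units for it and UCS-2 cannot represent it at all.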

Newer editions of ISO 8859 express characters in terms of their Unicode/UCS names and the U+nnnn notation, effectively causing each part of ISO 8859 to be a Unicode/UCS character encoding scheme that maps a very small subset of the UCS to single 8-bit bytes. The first 256 characters in Unicode and the UCS are identical to those in ISO-8859-1.
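The correspondence between ISO-8859-1 and the first 256 Unicode code points can be checked directly. The sketch assumes Python's `iso8859_1` codec, which covers all 256 byte values including the control ranges.

```python
all_bytes = bytes(range(256))
decoded = all_bytes.decode("iso8859_1")

# Each byte value should decode to the code point with the same number.
identical = all(ord(ch) == b for ch, b in zip(decoded, all_bytes))

print(identical)
```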

ISO 8859 was favored throughout the 1990s, having the advantages of being well-established and more easily implemented in software: the equation of one byte to one character is simple and adequate for most single-language applications, and there are no combining characters or variant forms.

As the relative cost, in computing resources, of using more than one byte per character began to diminish, programming languages and operating systems added native support for Unicode alongside their system of code pages. As Unicode-enabled operating systems became more widespread, ISO 8859 and other legacy encodings became less popular. While remnants of ISO 8859 and single-byte character models remain entrenched in many operating systems, programming languages, data storage systems, networking applications, display hardware, and end-user application software, most modern computing applications use Unicode internally, and rely on conversion tables to map to and from the simpler encodings, when necessary.
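Conversion between a legacy encoding and Unicode can be sketched in a few lines of Python. The function name `transcode` and the sample word are made up for illustration, and Python's built-in codecs stand in for the conversion tables mentioned above.

```python
def transcode(data: bytes, source: str, target: str = "utf-8") -> bytes:
    """Convert bytes from one encoding to another via Unicode text."""
    return data.decode(source).encode(target)

# A Polish place name stored in ISO 8859-2: one byte per character.
legacy = "Łódź".encode("iso8859_2")

# Converted to UTF-8, the non-ASCII letters take two bytes each.
utf8 = transcode(legacy, "iso8859_2")
print(len(legacy), len(utf8))

# Decoding with the wrong part does not fail; it silently yields other text.
misread = legacy.decode("iso8859_1")
print(misread)
```

The final step shows why the source encoding has to be known in advance: single-byte encodings carry no marker identifying which part was used.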

Development status

The ISO/IEC 8859 standard was maintained by ISO/IEC Joint Technical Committee 1, Subcommittee 2, Working Group 3 (ISO/IEC JTC 1/SC 2/WG 3). In June 2004, WG 3 disbanded, and maintenance duties were transferred to SC 2. The standard is not currently being updated, as the Subcommittee's only remaining Working Group, WG 2, is concentrating on development of ISO/IEC 10646.

References

Published versions of each part of ISO/IEC 8859 are available, for a fee, from the ISO catalogue site (http://www.iso.ch/iso/en/stdsdevelopment/tc/tclist/TechnicalCommitteeStandardsListPage.TechnicalCommitteeStandardsList?COMMID=23) and from the ANSI eStandards Store (http://webstore.ansi.org/ansidocstore/find.asp?find_spec=8859).

PDF versions of the final drafts of some parts of ISO/IEC 8859 as submitted for review & publication by ISO/IEC JTC 1/SC 2/WG 3 are available at the WG 3 web site (http://anubis.dkuug.dk/JTC1/SC2/WG3/):
- ISO/IEC 8859-1:1998 (http://anubis.dkuug.dk/JTC1/SC2/WG3/docs/n411.pdf) - 8-bit single-byte coded graphic character sets, Part 1: Latin alphabet No. 1 (draft dated February 12, 1998, published April 15, 1998)
- ISO/IEC 8859-4:1998 (http://anubis.dkuug.dk/JTC1/SC2/WG3/docs/n413.pdf) - 8-bit single-byte coded graphic character sets, Part 4: Latin alphabet No. 4 (draft dated February 12, 1998, published July 1, 1998)
- ISO/IEC 8859-7:1999 (http://anubis.dkuug.dk/jtc1/sc2/open/02n3329.pdf) - 8-bit single-byte coded graphic character sets, Part 7: Latin/Greek alphabet (draft dated June 10, 1999; superseded by ISO/IEC 8859-7:2003, published October 10, 2003)
- ISO/IEC 8859-10:1998 (http://anubis.dkuug.dk/JTC1/SC2/WG3/docs/n415.pdf) - 8-bit single-byte coded graphic character sets, Part 10: Latin alphabet No. 6 (draft dated February 12, 1998, published July 15, 1998)
- ISO/IEC 8859-11:1999 (http://anubis.dkuug.dk/jtc1/sc2/open/02n3333.pdf) - 8-bit single-byte coded graphic character sets, Part 11: Latin/Thai character set (draft dated June 22, 1999; superseded by ISO/IEC 8859-11:2001, published Dec 15, 2001)
- ISO/IEC 8859-13:1998 (http://anubis.dkuug.dk/JTC1/SC2/WG3/docs/n451.pdf) - 8-bit single-byte coded graphic character sets, Part 13: Latin alphabet No. 7 (draft dated April 15, 1998, published October 15, 1998)
- ISO/IEC 8859-15:1998 (http://anubis.dkuug.dk/JTC1/SC2/WG3/docs/n404.pdf) - 8-bit single-byte coded graphic character sets, Part 15: Latin alphabet No. 9 (draft dated August 1, 1997; superseded by ISO/IEC 8859-15:1999, published March 15, 1999)
- ISO/IEC 8859-16:2000 (http://anubis.dkuug.dk/jtc1/sc2/open/02n3389.pdf) - 8-bit single-byte coded graphic character sets, Part 16: Latin alphabet No. 10 (draft dated November 15, 1999; superseded by ISO/IEC 8859-16:2001, published July 15, 2001)

ECMA standards, which in intent correspond exactly to the ISO/IEC 8859 character set standards, can be found at:
- Standard ECMA-94 (http://www.ecma-international.org/publications/standards/Ecma-094.htm): 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 2nd edition (June 1986)
- Standard ECMA-113 (http://www.ecma-international.org/publications/standards/Ecma-113.htm): 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Cyrillic Alphabet 3rd edition (December 1999)
- Standard ECMA-114 (http://www.ecma-international.org/publications/standards/Ecma-114.htm): 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Arabic Alphabet 2nd edition (December 2000)
- Standard ECMA-118 (http://www.ecma-international.org/publications/standards/Ecma-118.htm): 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Greek Alphabet (December 1986)
- Standard ECMA-121 (http://www.ecma-international.org/publications/standards/Ecma-121.htm): 8-Bit Single-Byte Coded Graphic Character Sets - Latin/Hebrew Alphabet 2nd edition (December 2000)
- Standard ECMA-128 (http://www.ecma-international.org/publications/standards/Ecma-128.htm): 8-Bit Single-Byte Coded Graphic Character Sets - Latin Alphabet No. 5 2nd edition (December 1999)
- Standard ECMA-144 (http://www.ecma-international.org/publications/standards/Ecma-144.htm): 8-Bit Single-Byte Coded Character Sets - Latin Alphabet No. 6 3rd edition (December 2000)

ISO/IEC 8859-1 to Unicode mapping tables (ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859) as plain text files are at the Unicode FTP site.

Informal descriptions and code charts for most ISO 8859 standards are available in ISO 8859 Alphabet Soup (http://czyborra.com/charsets/iso8859.html) (Mirror) (http://www.lysator.liu.se/~jmo/czyborra_index.html)

Retrieved from "http://en.wikipedia.org/wiki/ISO_8859"
