Kostis Netzwerkberatung
Copyright (c) 1993-2000 by Kostis Netzwerkberatung
Talstr. 25, D-63322 Rödermark, Tel. +49 6074 881056, FAX 881058
kosta@kostis.net (Kosta Kostis), http://www.kostis.net/

This information may be used free of charge at your own risk.

trans V1.30 2001-02-22


MS-DOS Codepage Character Encoding Information

Not all MS-DOS Codepages are listed here.

Note: In this document, MS-DOS Codepages are also referred to as IBM Codepages.

Code points 00-7F are identical to code points 00-7F in ISO/IEC 646:1991 (IRV).
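This statement can be spot-checked with any library that ships DOS codepage tables. The following is a minimal sketch in Python, assuming its built-in codecs `cp437`, `cp850` and `cp852` are acceptable stand-ins for the tables described here. Those codecs follow vendor-published Unicode mappings and may differ in detail from the files of this document; the function name is hypothetical.

```python
# Codec names are Python's, not the file names used in this document.
CODEPAGES = ["cp437", "cp850", "cp852"]


def low_half_is_ascii(codec):
    """Return True if bytes 00-7F decode to the same text under codec and ASCII."""
    low = bytes(range(0x80))
    return low.decode(codec) == low.decode("ascii")


# Map each codec name to the outcome of the check.
results = {cp: low_half_is_ascii(cp) for cp in CODEPAGES}
print(results)
```

Other codecs can be added to the list; a codec whose mapping deviates anywhere in 00-7F would show up as False.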

MS-DOS Codepage 437 (US English)

file   cp437
languages supported   English (default Codepage in the U.S.A.)
ISO/IEC 8859 equivalent   ISO/IEC 8859-1:1998
characters missing   41

MS-DOS Codepage 737 (Greek IBM PC de facto Standard)

file   cp737
languages supported   mainly Greek
ISO/IEC 8859 equivalent   ISO 8859-7:1987
characters missing   10

With the advent of MS-DOS 6.2 it is a "normal" Codepage now. MS-DOS 6.2 stores this Character Encoding in the file ega2.cpi.

This set is suited for multiple-language applications involving the Latin and the Greek scripts. It allows handling of data and text expressed in Greek. It's the standard encoding for most HGC and CGA cards in Greece.

It is also erroneously called IBM Codepage 437, mostly because it was not implemented using the "normal" MS-DOS Codepage scheme: installing it did not change the active Codepage, and the default Codepage for MS-DOS is IBM Codepage 437.

MS-DOS Codepage 850 (Multilingual - Latin 1)

file   cp850
languages supported   Danish, Dutch, English, Faeroese, Finnish, French, German, Icelandic, Irish, Italian, Norwegian, Portuguese, Spanish and Swedish.
ISO/IEC 8859 equivalent   ISO/IEC 8859-1:1998
characters missing   none

Allows 100% conversion to/from ISO 8859-1.
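One way to arrive at a "characters missing" figure is to count the characters at A0-FF of the ISO set that have no counterpart in the DOS Codepage. The sketch below does this with Python's built-in codecs. It is an assumption that this matches the counting rule used in this document; for example, characters that a Codepage only offers as glyphs below code point 20 are not found this way, so the numbers may differ from the ones listed. The function name is hypothetical.

```python
def missing_from(dos_codec, iso_codec="latin-1"):
    """List characters at A0-FF in iso_codec that dos_codec cannot encode."""
    missing = []
    for code in range(0xA0, 0x100):
        try:
            char = bytes([code]).decode(iso_codec)
        except UnicodeDecodeError:
            continue  # code point is undefined in the ISO set
        try:
            char.encode(dos_codec)
        except UnicodeEncodeError:
            missing.append(char)
    return missing


missing_850 = missing_from("cp850")
missing_437 = missing_from("cp437")
print(len(missing_850), len(missing_437))
```

An empty list for cp850 corresponds to the "none" entry above; a non-empty list for cp437 shows which ISO 8859-1 characters would be lost in a conversion.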

MS-DOS Codepage 851 (Greece) - obsolete

file   cp851
languages supported   mainly Greek
ISO/IEC 8859 equivalent   ISO 8859-7:1987
characters missing   too many

This character set is obsolete and is supplied for historical reasons only.

MS-DOS Codepage 852 (Multilingual - Latin 2)

file   cp852
languages supported   Albanian, Czech, English, German, Hungarian, Polish, Romanian, (Serbo-)Croatian, Slovak, Slovene and Swedish.
ISO/IEC 8859 equivalent   ISO/IEC 8859-2:1999
characters missing   none
undefined code points   AA

Unicode mappings from Microsoft state otherwise, but checking the actual Codepage 852 using Microsoft MS-DOS 6.22 clearly shows nothing at that code point.
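To see what a vendor-derived mapping reports for this code point, the table of an existing library can be inspected. The sketch below uses Python's built-in `cp852` codec, on the assumption that it is generated from the published Unicode mapping; it says nothing about what MS-DOS 6.22 actually displays.

```python
import unicodedata

# Decode the single byte AA with Python's cp852 table.
char = bytes([0xAA]).decode("cp852")
name = unicodedata.name(char, "<no name>")
print(f"cp852 AA -> U+{ord(char):04X} {name}")
```

If the library reports a character here, a converter built on it will silently map AA instead of flagging it as undefined.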

MS-DOS Codepage 853 (Multilingual - Latin 3)

file   cp853
languages supported   Afrikaans, Catalan, English, Esperanto, French, Galician, German, Italian, Maltese and Turkish.
ISO/IEC 8859 equivalent   ISO/IEC 8859-3:1999
characters missing   8
undefined code points   AA, D0-D1, DD, EE, F1, FB
no description, yet   F2

MS-DOS Codepage 855 (Russia) - obsolete

file   cp855
languages supported   mainly Russian
ISO/IEC 8859 equivalent   ISO/IEC 8859-5:1999
characters missing   2

In MS-DOS 6.22 this Character Encoding is supplied in the file ega3.cpi.

This character set is obsolete and is supplied for historical reasons only.

With the advent of MS-DOS 6.22 it is a "normal" Codepage.

MS-DOS Codepage 857 (Multilingual - Latin 5)

file   cp857
languages supported   Danish, Dutch, English, Finnish, French, German, Irish, Italian, Norwegian, Portuguese, Spanish, Swedish and Turkish.
ISO/IEC 8859 equivalent   ISO/IEC 8859-9:1999
characters missing    
undefined code points   D5, E7, F2

In MS-DOS 6.2 this Character Encoding is supplied in the file ega2.cpi.

With the advent of MS-DOS 6.2 it is a "normal" Codepage.

MS-DOS Codepage 860 (Portugal)

file   cp860

MS-DOS Codepage 861 (Iceland)

file   cp861

MS-DOS Codepage 862 (Israel)

file   cp862
languages supported   mainly Hebrew
ISO/IEC 8859 equivalent   ISO/IEC 8859-8:1999
characters missing   15

This set is suited for multiple-language applications involving the Latin and the Hebrew scripts. It allows handling of data and text expressed in Hebrew.

MS-DOS Codepage 863 (Canada (French))

file   cp863

MS-DOS Codepage 864 (Arabic)

file   cp864
languages supported   mainly Arabic
ISO/IEC 8859 equivalent   ISO/IEC 8859-6:1999
characters missing   ?
no description, yet   9B-9C, 9F, FF

This set is suited for multiple-language applications involving the Latin and the Arabic scripts. It allows handling of data and text expressed in Arabic.

This Codepage has BOX DRAWINGS characters below code point 20, but I decided not to include them there, because TAB, CR and LF would then be missing.

Microsoft's Unicode mappings currently use "ISOLATED FORM" variants of most characters, which would make conversion to/from ISO 8859-6 impossible. I guess they are in error here. I hope to clarify this someday. I also doubt that code point 25 is ARABIC PERCENT SIGN as indicated by Microsoft.
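Both observations can be reproduced against a mapping derived from the vendor tables. The following sketch assumes that Python's built-in `cp864` codec is such a mapping; the helper name is hypothetical, and undefined code points are simply skipped.

```python
import unicodedata


def describe(code, codec="cp864"):
    """Return the Unicode name for one code point, or None if it is undefined."""
    try:
        char = bytes([code]).decode(codec)
    except UnicodeDecodeError:
        return None
    return unicodedata.name(char, "")


# Name reported for code point 25.
percent_name = describe(0x25)

# Code points in the upper half whose Unicode name is an ISOLATED FORM variant.
isolated = [c for c in range(0x80, 0x100)
            if "ISOLATED FORM" in (describe(c) or "")]

print(percent_name)
print(len(isolated), "code points map to ISOLATED FORM variants")
```

Characters of this kind have no counterpart in ISO 8859-6, so a strict conversion through such a table fails for them.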

MS-DOS Codepage 865 (Norway)

file   cp865

MS-DOS Codepage 866 (Russia)

file   cp866
languages supported   mainly Russian
ISO/IEC 8859 equivalent   ISO/IEC 8859-5:1999
characters missing   21

In MS-DOS 6.22 this Character Encoding is supplied in the file ega3.cpi.

This set is suited for multiple-language applications involving the Latin and the Russian (Cyrillic) scripts. It allows handling of data and text expressed in Russian.

With the advent of MS-DOS 6.22 it is a "normal" Codepage.

MS-DOS Codepage 869 (Greek)

file   cp869
languages supported   mainly Greek
ISO/IEC 8859 equivalent   ISO 8859-7:1987
characters missing   8
no description, yet   80-85, 87, 93-94

In MS-DOS 6.2 this Character Encoding is supplied in the file ega2.cpi.

This set is suited for multiple-language applications involving the Latin and the Greek scripts. It allows handling of data and text expressed in Greek.

With the advent of MS-DOS 6.2 it is a "normal" Codepage.

MS-DOS Codepage 895 (Czech Kamenicky)

file   cp895
languages supported   mainly Czech and Slovak
ISO/IEC 8859 equivalent   ISO/IEC 8859-2:1999
characters missing   47

This set is suited for Czech and Slovak text. It has apparently been the standard encoding for at least CGA cards in the Czech and Slovak Republics. It is actually just a modified cp437 adapted to Czech and Slovak needs.

A more ISO/IEC 8859-2:1999 friendly Codepage is 852. Please use cp852 if you can.

ISO 8859-x to MS-DOS Codepage mappings

There are IBM Codepages compatible with ISO/IEC 8859, not only sharing the same characters but even the same code points.

ISO/IEC Standard                            IBM ISO Codepage   IBM Codepage
ISO/IEC 8859-1:1998 Latin 1                 819                850
ISO/IEC 8859-2:1999 Latin 2                 912                852
ISO/IEC 8859-3:1999 Latin 3                 913                853
ISO/IEC 8859-4:1998 Latin 4                 914                -
ISO/IEC 8859-5:1999 Cyrillic                915                (866)
ISO/IEC 8859-6:1999 Arabic                  1089               (864)
ISO 8859-7:1987 Greek                       813                (869)
ISO/IEC 8859-8:1999 Hebrew                  916                (862)
ISO/IEC 8859-9:1999 Latin 5                 920                857
ISO/IEC 8859-10:1999 Latin 6                919                -
ISO/IEC 8859-13:1999 Latin 7 (Baltic Rim)
ISO/IEC 8859-14:1999 Latin 8 (Celtic)
ISO/IEC 8859-15:1999 Latin 9

Note: Codepages shown in parentheses are similar, but not all characters of the ISO/IEC character set are contained in the IBM Codepage.
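The table can be turned into a small lookup for converting ISO 8859 data to the matching DOS Codepage. The sketch below is only an illustration: it uses Python's codec names, covers only pairs for which Python ships a codec (853 is not among them) and leaves out 864 because of the open questions noted for it. The names `ISO_TO_DOS` and `iso_to_dos` are hypothetical. Entries listed in parentheses in the table are converted with replacement characters, since some characters may be missing.

```python
# ISO codec -> (DOS codec, True if the table lists it without parentheses)
ISO_TO_DOS = {
    "iso8859_1": ("cp850", True),
    "iso8859_2": ("cp852", True),
    "iso8859_5": ("cp866", False),
    "iso8859_7": ("cp869", False),
    "iso8859_8": ("cp862", False),
    "iso8859_9": ("cp857", True),
}


def iso_to_dos(data, iso_codec):
    """Convert bytes in an ISO 8859 encoding to the DOS Codepage from the table."""
    dos_codec, complete = ISO_TO_DOS[iso_codec]
    text = data.decode(iso_codec)
    # Lossy pairs substitute '?' for characters the DOS Codepage lacks.
    return text.encode(dos_codec, errors="strict" if complete else "replace")


sample = "Příliš žluťoučký kůň"
dos_bytes = iso_to_dos(sample.encode("iso8859_2"), "iso8859_2")
print(dos_bytes.decode("cp852"))
```

Using strict error handling for the pairs without parentheses makes any unexpected gap in the DOS Codepage visible as an exception instead of silently losing data.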