ASCII

ASCII (American Standard Code for Information Interchange) is a character encoding standard for electronic communication. It is a 7-bit code that defines 128 values (0-127): control characters, English letters, digits (0-9), and punctuation and special characters (like !@#). It does not support characters from other languages.
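As a rough illustration of that limit, the sketch below uses Python's built-in "ascii" codec. The helper name is_ascii_encodable is made up for this example and is not part of any library.

```python
# Minimal sketch: ASCII maps each supported character to a value in 0-127.
text = "Hi!"
codes = [ord(ch) for ch in text]      # numeric value of each character
encoded = text.encode("ascii")        # one byte per character


def is_ascii_encodable(s: str) -> bool:
    """Hypothetical helper: True if every character in s exists in ASCII."""
    try:
        s.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


print(codes)                              # values all fall below 128
print(encoded)
print(is_ascii_encodable("cafe"))         # plain English letters
print(is_ascii_encodable("caf\u00e9"))    # contains an accented e
```

The accented letter in the last call has no ASCII value, so encoding fails and the helper reports False.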

Extended ASCII

Extended ASCII encodings were primarily used in early computers and simple text files. The range from 128 to 255 is often referred to as the "extended ASCII" range. The original ASCII standard does not define this range, but various 8-bit encodings built on ASCII use it. Different systems and languages used different extended ASCII sets, which led to compatibility issues: the same byte value could display as a different character on another system. For example, IBM's Code Page 437, ISO 8859-1 (Latin-1), and Windows-1252 are all extended ASCII sets, but they differ in which characters they assign in the 128-255 range. These extended characters are often called "upper-128 characters" or "top 128 characters" because they occupy the second half of the values an 8-bit byte can hold (2^8 = 256, 256 / 2 = 128).
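To make the compatibility problem concrete, here is a small sketch that decodes one byte from the extended range with the three encodings named above. It assumes Python's standard codec names "cp437", "latin-1", and "cp1252". The byte 0x80 (decimal 128) is just a convenient example value.

```python
# One byte from the 128-255 range, interpreted three different ways.
raw = bytes([0x80])

decoded = {name: raw.decode(name) for name in ("cp437", "latin-1", "cp1252")}

for name, ch in decoded.items():
    # Show the resulting character as a Unicode code point for comparison.
    print(f"{name:8} U+{ord(ch):04X} {ch!r}")

# The extended range is the upper half of the 256 values a byte can hold.
upper_half_size = 2 ** 8 // 2
print(upper_half_size)
```

The same byte becomes a C-cedilla under Code Page 437, a control character under ISO 8859-1, and the euro sign under Windows-1252.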

A byte is 8 bits, so why say "8-bit byte" for emphasis?

Originally, the term "byte" did not have a standardised size. Early computers used different byte sizes, including 6, 7, 8, 9, or more bits per byte. Over time, the 8-bit byte became the standard for most modern computer systems. Today, the term "8-bit byte" is sometimes used to be explicit and avoid confusion, especially in technical contexts or when dealing with older systems or documentation that might refer to different byte sizes.

ANSI

ANSI refers to a family of 8-bit character encodings used primarily on Microsoft Windows, often called Windows code pages. These character sets extend ASCII by using the values 128-255 for additional characters. Windows-1252, used for Western European languages, is one of the most common ANSI character sets.

OEM

OEM (original equipment manufacturer) character sets were developed for specific hardware platforms, particularly early personal computers. IBM's Code Page 437 is one example.
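The difference between an ANSI and an OEM code page can be sketched by encoding the same text with Windows-1252 and Code Page 437. This assumes Python's "cp1252" and "cp437" codecs. The sample word is arbitrary.

```python
# The same text stored under an ANSI code page and an OEM code page.
word = "caf\u00e9"  # "cafe" with an accented final e

ansi_bytes = word.encode("cp1252")   # Windows-1252 (ANSI)
oem_bytes = word.encode("cp437")     # IBM Code Page 437 (OEM)

print(ansi_bytes.hex(" "))
print(oem_bytes.hex(" "))

# Reading ANSI bytes with the OEM code page garbles the non-ASCII character.
misread = ansi_bytes.decode("cp437")
print(misread)
```

The ASCII letters keep the same byte values in both, while the accented letter is stored as a different byte in each, so text written under one code page and read under the other comes out wrong.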

DBCS

Double-Byte Character Sets (DBCS) use either one or two bytes to represent each character.

DBCS encodings are primarily tailored to languages with large character sets, such as Traditional Chinese (Big5) and Japanese (Shift-JIS). They were widely used before Unicode became the standard for text encoding.
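The one-or-two-byte behaviour can be checked with a short sketch, assuming Python's "shift_jis" and "big5" codecs. The sample characters are arbitrary picks.

```python
# Count how many bytes each character needs in two DBCS encodings.
samples = [
    ("A", "shift_jis"),   # ASCII letter
    ("日", "shift_jis"),  # Japanese kanji
    ("A", "big5"),        # ASCII letter
    ("中", "big5"),       # Traditional Chinese character
]

byte_counts = {(ch, enc): len(ch.encode(enc)) for ch, enc in samples}

for (ch, enc), n in byte_counts.items():
    print(f"{ch!r} in {enc}: {n} byte(s)")

# A mixed string therefore has a byte length that differs from its character count.
mixed = "A日"
mixed_len = len(mixed.encode("shift_jis"))
print(len(mixed), mixed_len)
```

Because characters vary in width, the number of bytes in a DBCS string cannot be taken as the number of characters.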

Unicode

Unicode is a universal character encoding standard designed to represent text and symbols from all the world’s writing systems. It is used across modern computer systems and software to ensure consistent representation.
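As a small sketch of what "universal" means in practice, the snippet below lists the Unicode code point and official character name for characters from several scripts held in one string. It uses Python's standard unicodedata module. The sample characters are arbitrary.

```python
import unicodedata

# One string mixing Latin, accented Latin, CJK, and a currency symbol.
text = "A\u00e9\u65e5\u20ac"

code_points = [ord(ch) for ch in text]
names = [unicodedata.name(ch) for ch in text]

for ch, cp, name in zip(text, code_points, names):
    print(f"{ch}  U+{cp:04X}  {name}")
```

Each character has a single code point regardless of platform, which is what removes the code page ambiguity described earlier.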