Sun Microsystems, Inc.

Currency

Currency units and presentation order vary greatly around the world. Local and international symbols for currency can differ. The following table shows monetary formats in some countries.

Table 1-4 International Monetary Conventions

Locale              Currency      Example
Canadian (English)  Dollar ($)    $1,234.56
Canadian (French)   Dollar ($)    1 234,56$
Danish              Kroner (kr)   Kr 1.234,56
Finnish             Euro (€)      1 234,56
French              Euro (€)      1,234
Japanese            Yen (¥)       ¥ 1,234
Norwegian           Krone (kr)    kr 1.234,56
Swedish             Krona (Kr)    1 234,56 Kr
Great Britain       Pound (£)     £1,234.56
United States       Dollar ($)    $1,234.56
Thai                Baht          2539 Baht
Euro                Euro (€)      5,000
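The conventions in Table 1-4 can be captured as a small set of per-locale parameters: the symbol, its position, the grouping separator, and the decimal separator. The following Python sketch is illustrative only. The `CONVENTIONS` dictionary, its keys, and the `format_money` function are hypothetical names, and the values are copied from a few rows of the table rather than read from the system locale database.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical per-locale conventions, copied from rows of Table 1-4.
# Each entry: (symbol, symbol_first, gap, grouping separator, decimal separator)
CONVENTIONS = {
    "en_CA": ("$", True, "", ",", "."),     # $1,234.56
    "fr_CA": ("$", False, "", " ", ","),    # 1 234,56$
    "no_NO": ("kr", True, " ", ".", ","),   # kr 1.234,56
    "sv_SE": ("Kr", False, " ", " ", ","),  # 1 234,56 Kr
}

def format_money(amount, locale_name):
    """Format a non-negative amount using the conventions for locale_name."""
    symbol, symbol_first, gap, group_sep, decimal_sep = CONVENTIONS[locale_name]
    # Round to two decimal places, then split into whole and fractional parts.
    quantized = Decimal(str(amount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    whole, frac = f"{quantized:.2f}".split(".")
    # Insert the grouping separator every three digits, starting from the right.
    groups = []
    while len(whole) > 3:
        groups.insert(0, whole[-3:])
        whole = whole[:-3]
    groups.insert(0, whole)
    number = group_sep.join(groups) + decimal_sep + frac
    return symbol + gap + number if symbol_first else number + gap + symbol
```

An application would normally take these values from the LC_MONETARY category of the current locale instead of a hard-coded table.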

The Solaris 9 software supports the euro currency. Local currency symbols are still available for backward compatibility.

Table 1-5 User Locales To Support the Euro Currency

Region              Locale Name         ISO Codeset
Austria             de_AT.ISO8859-15    8859-15
Belgium (French)    fr_BE.ISO8859-15    8859-15
Belgium (Flemish)   nl_BE.ISO8859-15    8859-15
Denmark             da_DK.ISO8859-15    8859-15
Finland             fi_FI.ISO8859-15    8859-15
France              fr_FR.ISO8859-15    8859-15
Germany             de_DE.ISO8859-15    8859-15
Ireland             en_IE.ISO8859-15    8859-15
Italy               it_IT.ISO8859-15    8859-15
Netherlands         nl_NL.ISO8859-15    8859-15
Portugal            pt_PT.ISO8859-15    8859-15
Catalan Spain       ca_ES.ISO8859-15    8859-15
Estonia             et_EE.ISO8859-15    8859-15
Spain               es_ES.ISO8859-15    8859-15
Sweden              sv_SE.ISO8859-15    8859-15
Great Britain       en_GB.ISO8859-15    8859-15
U.S.A.              en_US.ISO8859-15    8859-15

Euro locales are based on the ISO8859-15 codeset.

Keep in mind that a converted currency amount can take up more or less space than the original amount. For example, an amount of $1,000 can become 1.307.000 when converted to another currency.

The LC_MONETARY operand of the locale utility shows the current monetary settings for locales within the euro zone. The following table shows the status for Germany as an example.

Table 1-6 German Locale and Corresponding LC_MONETARY

Locale                 LC_MONETARY
de_DE.ISO8859-1        DM
de_DE.ISO8859-15       Euro
de_DE.UTF-8            Euro
de_DE.ISO8859-15@euro  Euro
de_DE.UTF-8@euro       Euro
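The locale names in the preceding tables follow the pattern language[_territory][.codeset][@modifier]. The following Python sketch splits a name into those parts, which makes it easy to see that the German entries differ only in codeset and modifier. The function name `parse_locale_name` is hypothetical, and the regular expression is a simplified assumption that covers the names shown in this section.

```python
import re

# language[_territory][.codeset][@modifier]
_LOCALE_NAME = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)

def parse_locale_name(name):
    """Split a locale name into its parts; absent parts are None."""
    match = _LOCALE_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized locale name: {name!r}")
    return match.groupdict()
```

For example, `parse_locale_name("de_DE.ISO8859-15@euro")` yields the language `de`, the territory `DE`, the codeset `ISO8859-15`, and the modifier `euro`.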

Language Word and Letter Differences

This section describes important differences between languages.

Word Delimiters

In English, words are usually separated by a space character. In languages such as Chinese, Japanese, and Thai, however, there is often no delimiter between words.
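The practical consequence is that splitting on white space finds English words but returns an entire Japanese or Thai sentence as a single token. One common workaround is dictionary-based longest matching. The following Python sketch shows the idea under simplifying assumptions: the `segment` function and the four-entry `TOY_DICTIONARY` are hypothetical, and a real segmenter needs a full word list and better handling of unknown words.

```python
def segment(text, dictionary):
    """Greedy longest-match segmentation against a known word list."""
    max_len = max(len(word) for word in dictionary)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words

# Toy word list for the sentence "I am a student" in Japanese.
TOY_DICTIONARY = {"私", "は", "学生", "です"}

english_words = "I am a student".split()             # four words
japanese_tokens = "私は学生です".split()               # one undivided token
japanese_words = segment("私は学生です", TOY_DICTIONARY)
```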

Sort Order

The sort order for particular characters is not the same in all languages. For example, the character "ö" sorts with the ordinary "o" in Germany, but sorts separately in Sweden, where it is the last letter of the alphabet. In some languages, characters carry weights that determine the priority of character sequences. For example, the Thai dictionary defines sort order through sequences of characters that have different weights.
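The "ö" example can be reproduced with two sort keys. The following Python sketch is a deliberately simplified illustration, not a full collation implementation: `german_key`, `swedish_key`, and `SWEDISH_ALPHABET` are hypothetical names, the keys handle only lowercase precomposed letters, and real applications rely on the LC_COLLATE category of the locale.

```python
# Simplified alphabet order for Swedish, where "å", "ä", "ö" follow "z".
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

def german_key(word):
    # Treat "ö" as an ordinary "o" for comparison purposes.
    return word.lower().replace("ö", "o")

def swedish_key(word):
    # Rank each letter by its position in the Swedish alphabet.
    # Raises ValueError for characters outside SWEDISH_ALPHABET.
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]

words = ["zon", "öl", "ost"]
german_order = sorted(words, key=german_key)    # "öl" sorts among the o-words
swedish_order = sorted(words, key=swedish_key)  # "öl" sorts after "zon"
```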

Character Sets

Character sets can differ in the number of alphabetic characters and special characters. While the English alphabet contains only 26 letters, some languages use many more characters. Japanese, for example, can contain over 20,000 characters, and Chinese can contain even more.

Western European Alphabets

The alphabets of most western European countries are similar to the standard 26-character alphabet used in English-speaking countries, but there are often some additional basic characters, some marked (or accented) characters, and some ligatures.

Japanese Text

Japanese text is composed of three different scripts mixed together: kanji ideographs derived from Chinese, and two phonetic scripts (or syllabaries), hiragana and katakana.

Although each character in hiragana has an equivalent in katakana, hiragana is the most common script, with cursive rather than block-like letter forms. Kanji characters are used to write root words. Katakana is mostly used to represent "foreign" words, that is, words "imported" from languages other than Japanese.

Kanji has tens of thousands of characters, but the number commonly used has been declining steadily over the years. Now only about 3500 are frequently used, although the average Japanese writer has a vocabulary of about 2000 kanji characters. Nonetheless, computer systems must support more than 7000 because that is what the Japanese Industrial Standard (JIS) requires. In addition, there are about 170 hiragana and katakana characters. On average, 55% of Japanese text is hiragana, 35% kanji, and 10% katakana. Arabic numerals and Roman letters are also present in Japanese text.
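The mix of scripts in a piece of Japanese text can be measured by checking which Unicode block each character falls in. The following Python sketch assumes the text is Unicode and uses the Hiragana, Katakana, and CJK Unified Ideographs block ranges. The function names `script_of` and `script_counts` are hypothetical, and the ranges ignore extension blocks and half-width forms.

```python
from collections import Counter

def script_of(ch):
    """Classify one character by Unicode block."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "hiragana"
    if 0x30A0 <= code <= 0x30FF:
        return "katakana"
    if 0x4E00 <= code <= 0x9FFF:
        return "kanji"
    return "other"

def script_counts(text):
    """Count how many characters of each script appear in text."""
    return Counter(script_of(ch) for ch in text)

# Three kanji, one hiragana, and four katakana characters.
counts = script_counts("日本語のテキスト")
```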

Although completely avoiding the use of kanji is possible, most Japanese readers find a text that is composed without any kanji hard to understand.

Korean Text

Korean text can be written using a phonetic writing system called Hangul. Hangul has more than 11,000 characters, which consist of consonants and vowels known as jamos. About 3000 of these characters are typically used in Korean computer systems. Korean also uses ideographs based on the set invented in China, called hanja. Korean text requires over 6000 hanja characters. Hanja is used mostly to avoid confusion when Hangul would be ambiguous. Each Hangul character is formed by combining consonants and vowels into one syllable. The jamos in a Hangul character are often arranged in a square, so that the group takes up the same space as a hanja character. Arabic numerals, Roman letters, and special symbol characters are also present in Korean text.
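In Unicode, the combination of jamos into a syllable is simple arithmetic, which also explains the figure of more than 11,000 characters: 19 initial consonants, 21 vowels, and 28 final positions (27 final consonants plus "none") give 11,172 syllables. The following Python sketch assumes Unicode conjoining jamo as input. The function name `compose_syllable` and the constant names are illustrative.

```python
S_BASE = 0xAC00   # first precomposed Hangul syllable
L_BASE = 0x1100   # first initial consonant (conjoining jamo)
V_BASE = 0x1161   # first vowel
T_BASE = 0x11A7   # one code point before the first final consonant
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # T_COUNT includes "no final consonant"

def compose_syllable(initial, vowel, final=None):
    """Combine conjoining jamo into one precomposed Hangul syllable."""
    l_index = ord(initial) - L_BASE
    v_index = ord(vowel) - V_BASE
    t_index = ord(final) - T_BASE if final is not None else 0
    if final is not None and t_index == 0:
        raise ValueError("not a final consonant")
    if not (0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT
            and 0 <= t_index < T_COUNT):
        raise ValueError("not a modern conjoining jamo sequence")
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

# Number of distinct syllables this scheme can produce.
TOTAL_SYLLABLES = L_COUNT * V_COUNT * T_COUNT

# HIEUH + A + final NIEUN form the first syllable of the word "Hangul".
han = compose_syllable("\u1112", "\u1161", "\u11ab")
```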

Thai Text

Thai text is displayed in column positions, each of which has four display cells. Each column position can hold up to three characters. The composition of a display cell is based on the classification of the Thai character. Some Thai characters can be composed with characters of another classification. If two characters can be composed together, both characters are in the same cell. Otherwise, they are in separate cells.
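One visible effect of this composition is that the number of characters in Thai text differs from the number of column positions it occupies. The following Python sketch approximates the count by treating every Unicode nonspacing mark (general category Mn) as sharing the column of the preceding character. This use of Unicode categories is an assumption that stands in for the Thai character classification described above, and the function name `column_count` is hypothetical.

```python
import unicodedata

def column_count(text):
    """Count characters that start a new column (anything but a nonspacing mark)."""
    return sum(1 for ch in text if unicodedata.category(ch) != "Mn")

# THO THAHAN + SARA II (above vowel) + MAI EK (tone mark): three characters.
stacked = "\u0e17\u0e35\u0e48"
# KO KAI + SARA I (above vowel) + NO NU: three characters.
word = "\u0e01\u0e34\u0e19"

stacked_columns = column_count(stacked)
word_columns = column_count(word)
```

Here `stacked` holds three characters in one column, while `word` holds three characters in two columns.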

Chinese Text

Chinese usually consists entirely of characters from the ideographic script called hanzi.

  • In the People's Republic of China (PRC) there are about 7000 commonly used hanzi characters in the GB2312 charset (zh locale), more than 20,000 characters in the GBK charset (zh.GBK locale), and about 30,000 characters in the GB18030-2000 charset (zh_CN.GB18030 locale), including all CJK extension A characters defined in Unicode 3.0.

  • In Taiwan, the most frequently used charsets are CNS11643-1992 (zh_TW locale) and Big5 (zh_TW.BIG5 locale). They share about 13,000 hanzi characters.

  • In Hong Kong, 4702 characters have been added to the Big5 charset to form the Big5-HKSCS charset (zh_HK.BIG5HK locale).
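The different sizes of these charsets mean that a given character might be encodable in one and not in another. The following Python sketch checks this by attempting to encode a character with each codec. The codec names are the Python standard library names that are assumed here to correspond to the charsets listed above, and the function name `encodable_in` is hypothetical.

```python
# Python codec names assumed to correspond to the charsets named above.
CHARSETS = ["gb2312", "gbk", "gb18030", "big5", "big5hkscs"]

def encodable_in(ch):
    """Return the charsets from CHARSETS that can encode the character."""
    supported = []
    for name in CHARSETS:
        try:
            ch.encode(name)
        except UnicodeEncodeError:
            continue
        supported.append(name)
    return supported

common = encodable_in("中")            # a common hanzi character
traditional = encodable_in("龍")       # traditional form, absent from GB2312
extension_a = encodable_in("\u3400")   # first CJK extension A character
```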

If a character is not a root character, it usually consists of two or more parts, two being most common. In two-part characters, one part generally represents meaning, and the other represents pronunciation. Occasionally both parts represent meaning. The radical is the most important element, and characters are traditionally arranged by radical, of which there are several hundred. A single sound can be represented by many different characters, which are not interchangeable in usage. A single character can have different sounds.

Some characters are more appropriate than others in a given context. The appropriate one is distinguished phonetically by the use of tones. By contrast, spoken Japanese and Korean lack tones.

Several phonetic systems represent Chinese. In the People's Republic of China the most common is pinyin, which uses Roman characters and is widely employed in the West for place names such as Beijing. The Wade-Giles system is an older phonetic system, formerly used for place names such as Peking. In Taiwan zhuyin (or bopomofo), a phonetic alphabet with unique letter forms, is often used instead.

Hebrew Text

Hebrew script is used for writing the Hebrew and Yiddish languages, and predates the English language by thousands of years. Hebrew is an example of a bidirectional script: Hebrew letters are written and read from right to left, while numbers are read from left to right. Any English text that is embedded in Hebrew text is also read from left to right.
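Unicode records this directionality as a property of each character, which software can inspect before laying out a line. The following Python sketch groups a string into runs of the same bidirectional class, using the `unicodedata.bidirectional` function from the Python standard library. The function name `direction_runs` is hypothetical, and the sketch only labels the runs. Reordering them for display is the job of the Unicode Bidirectional Algorithm, which is not implemented here.

```python
import unicodedata

def direction_runs(text):
    """Group consecutive characters that share a bidirectional class."""
    runs = []
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if runs and runs[-1][0] == cls:
            runs[-1][1] += ch
        else:
            runs.append([cls, ch])
    return [(cls, run) for cls, run in runs]

# Hebrew word, then a number, then Latin letters, in logical (typing) order.
sample = "\u05e9\u05dc\u05d5\u05dd 123 abc"
runs = direction_runs(sample)
```

In the result, the Hebrew letters have class `R` (right to left), the digits have class `EN` (European number), the Latin letters have class `L` (left to right), and the spaces have class `WS`.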

Hebrew uses a 27-character alphabet, and takes punctuation marks and numbers from the standard Latin (or English) alphabet. Hebrew text can also include vowel and pronunciation marks. These marks appear as a dot (dagesh) inside the base character, as vowel marks below the character, or as accents to the upper left of the character. These marks are generally used only in liturgical text, and are rarely seen in day-to-day use. Hebrew has no uppercase letters.
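Because the marks are optional, software that searches or compares Hebrew text often needs to ignore them. In Unicode the marks are separate nonspacing characters, so they can be filtered out. The following Python sketch assumes Unicode input. The function name `strip_marks` is hypothetical, and the filter removes every nonspacing mark (general category Mn), not only Hebrew ones.

```python
import unicodedata

def strip_marks(text):
    """Remove nonspacing marks such as vowel points, dagesh, and accents."""
    # Decompose first so that precomposed presentation forms are split
    # into a base letter followed by its marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")

# SHIN + SHIN DOT + QAMATS, LAMED, VAV + HOLAM, FINAL MEM
pointed = "\u05e9\u05c1\u05b8\u05dc\u05d5\u05b9\u05dd"
plain = strip_marks(pointed)
```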

 
 
 