General Internationalization Features
This section discusses several internationalization features contained in the Solaris 9 environment.
Support for Codeset Independence
EUC is an abbreviation for Extended UNIX Code. The Solaris 9 operating environment supports non-EUC encodings such as PC-Kanji (better known as Shift_JIS) in Japan, Big5 in Taiwan, and GBK in the People's Republic of China. Because a large part of the computer market demands non-EUC codeset support, the Solaris 9 environment provides a solid framework to enable both EUC and non-EUC codeset support. This support is called Codeset Independence, or CSI.
The goal of CSI is to remove dependencies on specific codesets or encoding methods from Solaris operating environment libraries and commands. The CSI architecture allows the Solaris operating environment to support any encoding that is safe for UNIX file systems. CSI supports a number of new codesets, such as UTF-8, PC-Kanji, and Big5.
CSI Approach
Codeset independence enables application and platform software developers to keep their code independent of any particular encoding, such as UTF-8, and to adopt a new encoding without modifying the source code. This architectural approach differs from Java internationalization in that Java requires applications to be UTF-16-dependent.
Many existing internationalized applications (for example, Motif) automatically inherit CSI support from the underlying system. These applications work in the new locales without modification.
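As an illustration of what codeset-independent code can look like, the following minimal sketch counts the characters in a multibyte string by using the standard C function mbrtowc(), which decodes according to the current locale rather than a hard-coded encoding. The function name count_chars is hypothetical and is not part of any Solaris library.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/*
 * Count the characters (not bytes) in a null-terminated multibyte string.
 * The bytes are interpreted in the codeset of the current LC_CTYPE locale,
 * so the same source works for EUC and non-EUC codesets alike.
 * Returns -1 if the string contains an invalid or truncated sequence.
 * count_chars is a hypothetical helper name used only for this sketch.
 */
long count_chars(const char *s)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    long count = 0;

    memset(&state, 0, sizeof state);    /* start in the initial shift state */
    while (remaining > 0) {
        /* A NULL first argument converts without storing the wide character. */
        size_t n = mbrtowc(NULL, s, remaining, &state);
        if (n == (size_t)-1 || n == (size_t)-2)
            return -1;                  /* invalid or incomplete character */
        if (n == 0)
            break;                      /* reached a null character */
        s += n;
        remaining -= n;
        count++;
    }
    return count;
}
```

A program would normally call setlocale(LC_ALL, "") once at startup so that the locale selected in the user's environment determines how the bytes are decoded. Without that call the program runs in the default C locale.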
CSI is inherently independent of any particular codeset. However, the following assumptions about file code encodings (codesets) still apply to the Solaris 9 environment:
- The null byte value (0x00) does not appear as part of any multibyte character, so that null-terminated multibyte character strings are supported.
- The ASCII slash character byte value (0x2f) does not appear as part of any multibyte character, so that UNIX path names are supported.
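One consequence of these two assumptions is that code can search a path name for the slash byte, and rely on the terminating null byte, without decoding any characters. The sketch below shows this with a hypothetical helper, last_component, that returns the final component of a path by using the standard strrchr() function.

```c
#include <string.h>

/*
 * Return a pointer to the last component of a path name.
 * The search is done on raw bytes. Under the assumptions above this is
 * safe in any supported codeset, because 0x2f never occurs inside a
 * multibyte character and 0x00 occurs only as the string terminator.
 * last_component is a hypothetical helper name used only for this sketch.
 */
const char *last_component(const char *path)
{
    const char *slash = strrchr(path, '/');

    /* With no slash at all, the whole string is the component. */
    return slash != NULL ? slash + 1 : path;
}
```

For example, last_component("/usr/xpg4/bin/ls") points at "ls". The same byte-wise search would not be safe for a byte value that can occur inside a multibyte character, which is why the assumptions are stated for 0x00 and 0x2f specifically.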
CSI-enabled Commands
This section lists the CSI-enabled commands in the Solaris 9 environment. The man page for each command has an attribute section that indicates whether the command is CSI-enabled.
All commands are in the /usr/bin directory, unless otherwise noted.
- /usr/lib/diffh
- /usr/sbin/accept
- /usr/sbin/reject
- /usr/ucb/lpr
- /usr/xpg4/bin/awk
- /usr/xpg4/bin/cp
- /usr/xpg4/bin/date
- /usr/xpg4/bin/du
- /usr/xpg4/bin/ed
- /usr/xpg4/bin/edit
- /usr/xpg4/bin/egrep
- /usr/xpg4/bin/env
- /usr/xpg4/bin/ex
- /usr/xpg4/bin/expr
- /usr/xpg4/bin/fgrep
- /usr/xpg4/bin/lp
- /usr/xpg4/bin/ls
- /usr/xpg4/bin/more
- /usr/xpg4/bin/mv
- /usr/xpg4/bin/nice
- /usr/xpg4/bin/nohup
- /usr/xpg4/bin/od
- /usr/xpg4/bin/pr
- /usr/xpg4/bin/rm
- /usr/xpg4/bin/sed
- /usr/xpg4/bin/sort
- /usr/xpg4/bin/tail
- /usr/xpg4/bin/tr
- /usr/xpg4/bin/vedit
- /usr/xpg4/bin/vi
- /usr/xpg4/bin/view
- acctcom
- apropos
- batch
- bdiff
- cancel
- cat
- catman
- chgrp
- chmod
- chown
- cmp
- col
- comm
- compress
- cpio
- csh
- csplit
- cut
- diff
- diff3
- disable
- echo
- expand
- file
- find
- fold
- ftp
- gencat
- getopt
- getoptcvt
- head
- join
- jsh
- kill
- ksh
- lp
- man
- mkdir
- msgfmt
- news
- nroff
- pack
- paste
- pcat
- pg
- printf
- priocntl
- ps
- pwd
- rcp
- red
- remsh
- rksh
- rmdir
- rsh
- script
- sdiff
- settime
- sh
- split
- strconf
- strings
- sum
- tabs
- tar
- tee
- touch
- tty
- uncompress
- unexpand
- uniq
- unpack
- wc
- whatis
- write
- xargs
- zcat