Complete.Org: Mailing Lists: Archives: freeciv-dev: July 2002:
[Freeciv-Dev] network encoding in gui-gtk-2.0.
To: freeciv-dev@xxxxxxxxxxx
Subject: [Freeciv-Dev] network encoding in gui-gtk-2.0.
From: SAWADA Katsuya <ama@xxxxxxxxxxx>
Date: Wed, 03 Jul 2002 09:07:55 +0900

I have two wishes related to network encoding in gui-gtk-2.0.

1. Please take the default encoding from nl_langinfo(CODESET), not "ISO-8859-1".
2. Please use the word 'encoding' instead of 'charset'.


1. Please take the default encoding from nl_langinfo(CODESET), not "ISO-8859-1".

Currently, if "FREECIV_NETWORK_CHARSET" is not set, network_charset
falls back to "ISO-8859-1". However, there is a good function for
choosing the default encoding: nl_langinfo(CODESET). I know that this
function reports the terminal encoding on the client, not on the
server, but it is still useful.
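
A minimal sketch of the fallback order proposed above, assuming a POSIX
system. The helper name default_network_encoding() is hypothetical and is
not taken from the freeciv source; only "FREECIV_NETWORK_CHARSET",
nl_langinfo(CODESET) and "ISO-8859-1" come from this message.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper, not part of the freeciv source.
 * Order: environment variable, then locale codeset, then ISO-8859-1. */
static const char *default_network_encoding(void)
{
  const char *enc = getenv("FREECIV_NETWORK_CHARSET");

  if (enc != NULL && enc[0] != '\0') {
    return enc;
  }

  /* nl_langinfo() describes the current locale, so the locale has to be
   * initialised from the environment first. A real client would
   * normally have done this once at startup. */
  setlocale(LC_CTYPE, "");

  enc = nl_langinfo(CODESET);
  if (enc != NULL && enc[0] != '\0') {
    return enc;
  }

  /* Last resort: the current hard-coded default. */
  return "ISO-8859-1";
}
```

The string returned by nl_langinfo() may be overwritten by later calls to
nl_langinfo() or setlocale(), so a caller that keeps the value should copy
it.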


2. Please use the word 'encoding' instead of 'charset'.

The word 'charset' confuses me because it has two meanings. When I
set the environment variable "FREECIV_NETWORK_CHARSET", I did not know
what value I should set. If it is difficult to rename all the variables
and functions in the freeciv source code, I would accept renaming only
the environment variable.
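
If only the environment variable is renamed, the old name could still be
honoured for a while. A rough sketch, where both the new name
"FREECIV_NETWORK_ENCODING" and the helper network_encoding_from_env() are
hypothetical and chosen only for illustration:

```c
#include <stdlib.h>

/* Hypothetical helper, not part of the freeciv source.
 * Prefers an assumed new variable name and falls back to the existing
 * one. Returns NULL when neither is set to a non-empty value. */
static const char *network_encoding_from_env(void)
{
  /* "FREECIV_NETWORK_ENCODING" is an assumed name for illustration. */
  const char *value = getenv("FREECIV_NETWORK_ENCODING");

  if (value == NULL || value[0] == '\0') {
    /* Existing variable name, kept for compatibility. */
    value = getenv("FREECIV_NETWORK_CHARSET");
  }

  return (value != NULL && value[0] != '\0') ? value : NULL;
}
```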

The following is quoted from http://www.debian.org/doc/manuals/intro-i18n/
(Introduction to i18n).


     _Encoding_
          Encoding is a rule where characters and texts are expressed in
          combinations of bits or bytes in order to treat characters in
          computers.  Words of _character coding system_, _character code_,
          _charset_, and so on are used to express the same meaning.
          Basically, _encoding_ takes care of _characters_, not _glyphs_.
          There are many official and de-facto standards of encodings such
          as ASCII, ISO 8859-{1,2,...,15}, ISO 2022-{JP, JP-1, JP-2, KR,
          CN, CN-EXT, INT-1, INT-2}, EUC-{JP, KR, CN, TW}, Johab, UHC,
          Shift-JIS, Big5, TIS 620, VISCII, VSCII, so-called 'CodePages',
          UTF-7, UTF-8, UTF-16LE, UTF-16BE, KOI8-R, and so on so on.  To
          construct an encoding, we have to consider the following
          concepts.  (Encoding = one or more CCS + one CES).

     _Coded Character Set (CCS)_
          Coded character set (CCS) is a word defined in RFC 2130
          (http://www.faqs.org/rfcs/rfc2130.html) and means a character set
          where all characters have unique numbers by some method.  There
          are many national and international standards for CCS.  Many
          national standards for CCS adopt the way of coding so that they
          obey some of international standards such as ISO 646 or ISO 2022.
          ASCII, BS 4730, JISX 0201 Roman, and so on are examples of
          ISO-646 variants.  All ISO-646 variants, ISO 8859-*, JISX 0208,
          JISX 0212, KSX 1001, GB 2312, CNS 11643, CCCII, TIS 620, TCVN
          5712, and so on are examples of ISO 2022-compliant CCS.  VISCII
          and Big5 are examples of non-ISO 2022-compliant CCS.  UCS-2 and
          UCS-4 (ISO 10646) are also examples of CCS.


     _charset_ is also a well-used word.  This word is used very widely,
     for example, in MIME (like `Content-Type: text/plain;
     charset=iso-8859-1'), in XLFD (X Logical Font Description) font name
     (CharSetRegistry and CharSetEncoding fields), and so on.  Note that
     _charset_ in MIME is _encoding_, while _charset_ in XLFD font name is
     _coded character set_.  This is very confusing.  In this document,
     _charset_ and _character set_ are used in XLFD meaning, since I think
     _character set_ should mean a set of characters, not encoding.
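
To make the difference between a coded character set and an encoding
concrete, here is a small sketch that takes one JIS X 0208 code point (the
CCS) and expresses it as bytes in two different encodings, EUC-JP and
Shift-JIS. The function names are hypothetical; the arithmetic is the
usual conversion for two-byte JIS X 0208 codes in the range 0x21 to 0x7E
and does not cover JIS X 0201 or JIS X 0212.

```c
/* Hypothetical helpers for illustration only.
 * j1 and j2 are the two bytes of a JIS X 0208 code (each 0x21..0x7E). */

/* EUC-JP: set the high bit of both bytes. */
static void jis_to_euc(unsigned char j1, unsigned char j2,
                       unsigned char out[2])
{
  out[0] = (unsigned char)(j1 | 0x80);
  out[1] = (unsigned char)(j2 | 0x80);
}

/* Shift-JIS: pack two JIS rows into one lead byte. */
static void jis_to_sjis(unsigned char j1, unsigned char j2,
                        unsigned char out[2])
{
  unsigned char s1 = (unsigned char)(((j1 - 0x21) >> 1) + 0x81);
  unsigned char s2;

  if (s1 > 0x9F) {
    s1 = (unsigned char)(s1 + 0x40); /* skip the half-width kana area */
  }

  if (j1 & 1) {
    /* Odd row: first half of the trail-byte range, skipping 0x7F. */
    s2 = (unsigned char)(j2 + 0x1F);
    if (s2 >= 0x7F) {
      s2 = (unsigned char)(s2 + 1);
    }
  } else {
    /* Even row: second half of the trail-byte range. */
    s2 = (unsigned char)(j2 + 0x7E);
  }

  out[0] = s1;
  out[1] = s2;
}
```

For example, HIRAGANA LETTER A has the JIS X 0208 code 0x2422. The same
number becomes the bytes A4 A2 in EUC-JP and 82 A0 in Shift-JIS, which is
why a setting that names the byte-level form is better called an
'encoding'.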



-- 
SAWADA Katsuya

