[Freeciv-Dev] Re: Upper ascii characters in city names etc.

To: Thue <thue@xxxxxxx>
Cc: freeciv-dev@xxxxxxxxxxx
Subject: [Freeciv-Dev] Re: Upper ascii characters in city names etc.
From: Gaute B Strokkenes <gs234@xxxxxxxxx>
Date: Sat, 21 Jul 2001 19:51:57 +0200

On Thu, 19 Jul 2001, thue@xxxxxxx wrote:
> As you know the gtk client can bug if the strings it displays
> contains upper ascii characters (those that change between charsets)
> and the locale is set to C.

LC_CTYPE = "C" is not much of a problem.  The gringolandians will have
to set the locale properly like everyone else.  BFD.

  donald:~$ LC_ALL=en_US locale charmap
  ISO-8859-1

The real problem is what to do when the ISO-8859-1 characters are not
available in whatever charset the client is using.  I think that the
proper solution is to use some sort of transliteration, as you have
proposed.

The long-term solution for all such problems is to use Unicode and
UTF-8 for these purposes.

Here is my proposal in broad terms:

* All ruleset files should be in UTF-8.  Likewise for all
  `namestrings' (that is, strings of length MAX_LEN_NAME) on the wire,
  that is, in communication between client and server.

* Translatable strings in ruleset files should be in their own text
  domain, separate from translatable strings in Freeciv proper.
  Individual ruleset files should be able to specify the domain name.
  For Freeciv, namestrings should be Latin only (that is, no Cyrillic,
  Arabic or Greek characters or CJK ideographs etc.) on the grounds
  that we can't transliterate it properly, and because while Latin
  with funny diacritics is intelligible to anyone using a computer
  capable of running Freeciv, other alphabets are not.  In any case,
  if, say, a group of Russians wants proper Russian names, this is
  allowed for by the use of gettext.

* For a given town, ruler etc. the string should be the common Latin
  transcription of the name.  The `native' transcription should be
  used when available; when there is no `native' method available the
  English version should be used.  Thus Göteborg trumps Gothenburg,
  and Yeltsin trumps Jeltsin.

* The server and clients will have to perform conversion as necessary.
  This could be done at the packet level if necessary, so it should
  not be too much of a problem.  There should be a fallback for Latin
  characters that are not available (removing diacritics etc).
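
To make the last point concrete, here is a minimal sketch of what a
diacritic-stripping fallback could look like for a client whose charset
is plain ASCII.  The function name utf8_to_ascii_fallback() and the
table are hypothetical, not existing Freeciv code, and the choice of
approximations is my assumption.  It only covers U+00C0..U+00FF;
anything else outside ASCII becomes `?'.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ASCII approximations for U+00C0..U+00FF, in code point
 * order.  The particular choices (e.g. "ss" for U+00DF) are assumptions. */
static const char *const latin1_fallback[64] = {
  /* U+00C0 */ "A", "A", "A", "A", "A", "A", "AE", "C",
  /* U+00C8 */ "E", "E", "E", "E", "I", "I", "I", "I",
  /* U+00D0 */ "D", "N", "O", "O", "O", "O", "O", "x",
  /* U+00D8 */ "O", "U", "U", "U", "U", "Y", "Th", "ss",
  /* U+00E0 */ "a", "a", "a", "a", "a", "a", "ae", "c",
  /* U+00E8 */ "e", "e", "e", "e", "i", "i", "i", "i",
  /* U+00F0 */ "d", "n", "o", "o", "o", "o", "o", "/",
  /* U+00F8 */ "o", "u", "u", "u", "u", "y", "th", "y"
};

/* Convert the UTF-8 string `src' to an ASCII approximation in `dst',
 * which has room for `dstlen' bytes including the terminating NUL.
 * Output is truncated if it does not fit.  Returns the number of bytes
 * written, not counting the NUL. */
size_t utf8_to_ascii_fallback(const char *src, char *dst, size_t dstlen)
{
  const unsigned char *s = (const unsigned char *) src;
  size_t out = 0;

  if (dstlen == 0) {
    return 0;
  }

  while (*s != '\0') {
    char one[2] = { 0, 0 };
    const char *rep;
    size_t rep_len;

    if (*s < 0x80) {
      /* Plain ASCII: copy through unchanged. */
      one[0] = (char) *s;
      rep = one;
      s += 1;
    } else if (*s == 0xC3 && s[1] >= 0x80 && s[1] <= 0xBF) {
      /* U+00C0..U+00FF are encoded as 0xC3 followed by 0x80..0xBF. */
      rep = latin1_fallback[s[1] - 0x80];
      s += 2;
    } else {
      /* Anything else: emit '?' and skip the rest of the sequence. */
      rep = "?";
      s += 1;
      while ((*s & 0xC0) == 0x80) {
        s++;
      }
    }

    rep_len = strlen(rep);
    if (out + rep_len >= dstlen) {
      break;                    /* no room left; stop here */
    }
    memcpy(dst + out, rep, rep_len);
    out += rep_len;
  }

  dst[out] = '\0';
  return out;
}
```

Called on the UTF-8 bytes of "Göteborg" this yields "Goteborg".  A real
implementation would probably go through iconv() or similar and only
use a table like this when the target charset lacks the character.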

I promised to put this forward some time ago, but I didn't get round
to it until now, partly because I've been busy with work and the other
things that needed fixing in Freeciv, and partly because I've been
looking into the needed infrastructure.  I think that solid,
ubiquitous UTF-8 support is just around the corner.  For instance, the
next GTK+ version will use UTF-8 everywhere internally, and the latest
versions of XFree86 and glibc both include very good UTF-8 support.
The next Emacs will also have out-of-the-box support for UTF-8.

In the short term, I believe that we should simply discourage people
from running Freeciv in an ASCII-only locale.  Perhaps we could use a
test to silently change C/POSIX/ANSI_X3.4-1968 to ISO-8859-1.  If
we're going to implement the proper solution afterwards, it doesn't
make sense to put _too_ much effort into it.
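
A rough sketch of such a test, assuming a POSIX system with
nl_langinfo(CODESET).  The names effective_charset() and
client_charset() are hypothetical, and the list of ASCII codeset
spellings is my guess at what needs matching (glibc reports
ANSI_X3.4-1968 for the C locale; other systems may use other names).

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: if `codeset' names plain ASCII (or is missing),
 * return "ISO-8859-1" instead; otherwise return `codeset' unchanged. */
const char *effective_charset(const char *codeset)
{
  /* Spellings of ASCII to promote; this list is an assumption. */
  static const char *const ascii_names[] = {
    "ANSI_X3.4-1968", "US-ASCII", "ASCII"
  };
  size_t i;

  if (codeset == NULL || codeset[0] == '\0') {
    return "ISO-8859-1";
  }
  for (i = 0; i < sizeof ascii_names / sizeof ascii_names[0]; i++) {
    if (strcmp(codeset, ascii_names[i]) == 0) {
      return "ISO-8859-1";
    }
  }
  return codeset;
}

/* Pick up the user's locale and return the charset the client should
 * treat its display strings as being in. */
const char *client_charset(void)
{
  setlocale(LC_CTYPE, "");
  return effective_charset(nl_langinfo(CODESET));
}
```

With this, a user running under LC_ALL=C would be treated as if the
charset were ISO-8859-1, while someone already in a UTF-8 or
ISO-8859-2 locale is left alone.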

> Fx, what is the correct approximation for "ç"?

c,

-- 
Big Gaute                               http://www.srcf.ucam.org/~gs234/
I'm ANN LANDERS!!  I can SHOPLIFT!!

