[Freeciv-Dev] Re: (PR#1824) ruleset data is in incompatible charsets

To: Kenn.Munro@xxxxxxxxxxxxxx, jdwheeler42@xxxxxxxxx, jrg45@xxxxxxxxxxxxxxxxx, pawel@xxxxxxxxxxxxxxx, per@xxxxxxxxxxx
Cc: mrproper@xxxxxxxxxx, jlangley@xxxxxxx
Subject: [Freeciv-Dev] Re: (PR#1824) ruleset data is in incompatible charsets
From: "Jason Short" <jdorje@xxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 6 May 2004 21:41:48 -0700
Reply-to: rt@xxxxxxxxxxx

<URL: http://rt.freeciv.org/Ticket/Display.html?id=1824 >

Jason Short wrote:
> <URL: http://rt.freeciv.org/Ticket/Display.html?id=1824 >
> 
>>Before we start with this can you outline:
>> - your solution
>> - a graphic with the data stores and flows and its encodings
>> - a list of changes to the code
> 
> Here's a flowchart.

See also http://freeciv.org/~jdorje/iconv.png.

Vasco pointed out that there are some shortcomings here.

- We *can* assume that the locale charset (in green) is a superset of 
ASCII.  So it's safe to send ASCII (black) to the terminal.

- Note that we *can't* assume the GUI charset (in red) is a superset of 
ASCII.  For instance gui-sdl uses UTF-16 here.  We likewise can't assume 
anything about the relation of the green, red, and blue charsets.

- Input typed at the server console is in the locale encoding, and is 
not necessarily ASCII.  For instance with the /create command you may 
give a non-ASCII name.  This is in the green (locale) encoding and must 
be converted to the blue (unicode) encoding during the read process. 
Likely this is a one-line conversion.

- The terminal isn't the only place we need to use the locale encoding. 
We also need to use it with all syscalls.  Usually this won't be a 
problem.  But for instance someone could have a filename with non-ASCII 
characters.  If they tried to load it through the GTK2 load dialog 
they'd get the correct string in the GUI encoding (utf-8), which would 
then be sent to the server in the universal encoding (utf-8) and passed 
to fopen, which wants the locale encoding (which may not be utf-8).  So 
for some fopen calls we need to convert charsets again, while for 
others the argument is in ASCII and we need to be careful _not_ to 
convert charsets.  I'm not sure what other syscalls this might affect. 
However I think this is so rare we don't need to lose sleep over it.
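
As a rough sketch of the last two points, assuming POSIX iconv(3) is 
available: the helper names below (convert_charset, is_ascii, 
fopen_utf8_name) are made up for illustration and are not existing 
Freeciv functions.  The charset is passed in explicitly here; in 
practice the locale charset would presumably come from something like 
nl_langinfo(CODESET) after setlocale(LC_ALL, "").  The helper only 
suits NUL-terminated, byte-oriented charsets, so it would not work for 
the UTF-16 case in gui-sdl.

```c
#include <iconv.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated string from charset
 * `from` to charset `to`.  Returns a malloc'd string that the caller
 * must free, or NULL if the conversion fails. */
static char *convert_charset(const char *from, const char *to,
                             const char *text)
{
  iconv_t cd = iconv_open(to, from);

  if (cd == (iconv_t) -1) {
    return NULL;
  }

  size_t in_left = strlen(text);
  /* Assumption: 4 output bytes per input byte is a generous bound for
   * conversions between single-byte charsets and UTF-8. */
  size_t out_left = in_left * 4;
  char *out = malloc(out_left + 1);

  if (!out) {
    iconv_close(cd);
    return NULL;
  }

  char *in_ptr = (char *) text;   /* iconv() does not modify the input */
  char *out_ptr = out;
  size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);

  iconv_close(cd);
  if (rc == (size_t) -1) {
    /* Invalid or unrepresentable input, or the buffer was too small. */
    free(out);
    return NULL;
  }
  *out_ptr = '\0';
  return out;
}

/* Returns 1 if every byte of the string is 7-bit ASCII, else 0. */
static int is_ascii(const char *s)
{
  for (; *s != '\0'; s++) {
    if ((unsigned char) *s > 0x7f) {
      return 0;
    }
  }
  return 1;
}

/* Hypothetical fopen wrapper: the name arrives in UTF-8 (the universal
 * encoding) and is converted to the locale charset only when it
 * contains non-ASCII bytes.  Pure-ASCII names are passed through
 * untouched. */
static FILE *fopen_utf8_name(const char *utf8_name, const char *mode,
                             const char *locale_charset)
{
  if (is_ascii(utf8_name)) {
    return fopen(utf8_name, mode);
  }

  char *local_name = convert_charset("UTF-8", locale_charset, utf8_name);

  if (!local_name) {
    return NULL;
  }

  FILE *fp = fopen(local_name, mode);

  free(local_name);
  return fp;
}
```

With something like this, the read-side conversion for a console line 
would be a single call along the lines of 
convert_charset(locale_charset, "UTF-8", line), and the load path would 
call fopen_utf8_name() instead of fopen().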

jason



