Complete.Org: Mailing Lists: Archives: freeciv-dev: June 2001:
[Freeciv-Dev] Re: [FreeCiv-Cvs] thue: Gettext tells me that en_GB and ja
To: Thue <thue@xxxxxxx>
Cc: freeciv-dev@xxxxxxxxxxx, Robert Brady <rwb197@xxxxxxxxxxxxxxx>, SAWADA Katsuya <amanatto@xxxxxxxxxxxxxxxx>
Subject: [Freeciv-Dev] Re: [FreeCiv-Cvs] thue: Gettext tells me that en_GB and ja needs some ...
From: Gaute B Strokkenes <gs234@xxxxxxxxx>
Date: 01 Jul 2001 00:07:04 +0100

On Fri, 29 Jun 2001, thue@xxxxxxx wrote:
> On Friday 29 June 2001 13:00, Gaute B Strokkenes wrote:
>> On Thu, 28 Jun 2001, freeciv@xxxxxxxxxxxxxxxxxxx wrote:
>> > This is an automated notification of a change to freeciv cvs,
>> > on Thu Jun 28 12:49:26 PDT 2001 = Thu Jun 28 19:49:26 2001 (GMT)
>> > by Thue Janus Kristensen <thue@xxxxxxx>
>> >
>> > ---- Files affected:
>> >
>> > freeciv/po en_GB.po ja.po
>> >
>> > ---- Log message:
>> >
>> > Gettext tells me that en_GB and ja needs some fixes before we
>> > can remove fuzzy mark, so reinserted.
>>
>> I get precisely the same warnings with msgfmt -c when I run it on
>> en_GB.po and ja.po with and without the fuzzy flag, but then I'm
>> running 0.10.35 here.  Could you show me the warning?
> 
> file=./`echo ja | sed 's,.*/,,'`.gmo \
>   && rm -f $file && /usr/bin/msgfmt --statistics -o $file ja.po
> ja.po:11476: invalid multibyte sequence
> ja.po:11481: invalid multibyte sequence
> found 2 fatal errors
> make[2]: *** [ja.gmo] Error 1
> make[2]: Leaving directory `/mnt/data/freeciv-dev/expciv/po'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/mnt/data/freeciv-dev/expciv'
> make: *** [all-recursive-am] Error 2
> bash-2.05$ 
> 
> With fuzzy it doesn't complain at all.
> It is the funny 'u' in dunedain it doesn't like.

The problem is that gettext does not let the source strings (the ones
passed to gettext()) use a different charset from the po file.  In
practice, this means that the source charset has to be a subset of the
po charset, so in general you are restricted to ASCII in the source.
Using an 8-bit character such as "ú" directly in the source violates
this rule.  Once the fuzzy mark is removed from the header entry,
gettext actually checks that the byte sequences have a sensible
interpretation as text in the given charset, which is why the errors
only show up without the fuzzy flag.
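
As a rough illustration of that check (not how msgfmt itself is
implemented), the sketch below asks whether a byte string decodes
cleanly in a given charset.  The helper name is made up for this
example, and it assumes the source file stores "ú" as the single
ISO-8859-1 byte 0xFA.

```python
# Sketch only: mimics the kind of validity check described above.
# is_valid_in_charset is a hypothetical helper, not part of gettext.

def is_valid_in_charset(raw: bytes, charset: str) -> bool:
    """Return True if raw decodes cleanly as text in charset."""
    try:
        raw.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


# Assumption: the source is Latin-1, so "ú" is the single byte 0xFA.
latin1_msgid = "D\u00fanedain".encode("iso-8859-1")
# The plain-ASCII spelling proposed as the fix.
ascii_msgid = b"Dunedain"

if __name__ == "__main__":
    for charset in ("ascii", "utf-8", "iso-8859-1"):
        print(charset,
              is_valid_in_charset(latin1_msgid, charset),
              is_valid_in_charset(ascii_msgid, charset))
```

The ASCII spelling is accepted under every charset tried here, while
the 8-bit spelling is only accepted where 0xFA happens to be a valid
character.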

The proper fix is to change the source to read "Dunedain" and then add
a comment encouraging the translator to use "ú" if that character is
available in the charset that is being used.  I'll get working on it.
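
The choice left to each translator could be sketched roughly as
follows.  This is only an illustration of the idea: choose_msgstr and
its parameters are hypothetical names, and nothing here is gettext
API.

```python
# Sketch only: how a translator might decide between the ASCII msgid
# and the accented spelling, given the charset declared in the po file.
# choose_msgstr is a hypothetical helper for illustration.

def choose_msgstr(msgid: str, preferred: str, po_charset: str) -> str:
    """Use the preferred spelling if po_charset can represent it,
    otherwise fall back to the ASCII msgid from the source."""
    try:
        preferred.encode(po_charset)
    except UnicodeEncodeError:
        return msgid
    return preferred


if __name__ == "__main__":
    # A po file declaring ISO-8859-1 can carry the accented form.
    print(choose_msgstr("Dunedain", "D\u00fanedain", "iso-8859-1"))
    # A po file restricted to ASCII keeps the source spelling.
    print(choose_msgstr("Dunedain", "D\u00fanedain", "ascii"))
```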

en_GB.po is broken in this respect: it should either translate the
string as "Dunedain" or change its charset to ISO-8859-1.  I've cc-ed
the en_GB.po translator so that he can fix it in whichever way he
feels is most appropriate.

For ja.po I'm not sure what would be most appropriate, so I'm cc-ing
the translator as well to bring it to his attention.

-- 
Gaute Strokkenes BA  <-- Graduated yesterday 8-)

