[gopher] Re: Gopherness

To: gopher@xxxxxxxxxxxx
Subject: [gopher] Re: Gopherness
From: nunojsilva@xxxxxxxxxx (Nuno J. Silva)
Date: Sun, 10 Aug 2008 21:34:57 +0100
Reply-to: gopher@xxxxxxxxxxxx

JumpJet Mailbox <jumpjetinfo@xxxxxxxxx> writes:

> --- On Mon, 8/4/08, Nuno J. Silva <nunojsilva@xxxxxxxxxx> wrote:
>>
>> "Jay Nemrow" <jnemrow@xxxxxxx> writes:
>>
>>> On Mon, Aug 4, 2008 at 10:19 AM, Kyevan <kyevan@xxxxxxxxxxx> wrote:
>>>
>>>> What about older clients, though? Modern clients will probably
>>>> handle UTF-8 at least well enough to not explode, but older clients
>>>> might not.  Generally, it seems safest to stick to the subset that
>>>> is ASCII when reasonable, only using UTF-8 or such when it's
>>>> actually needed. ... is a perfectly readable replacement for
>>>> U+2026, even if it's not "typographically correct." On the other
>>>> hand, if you're trying to post a text in, say, a mix of Arabic, and
>>>> Klingon, go right ahead and use UTF-8.
>>
>> There are also the iso* charsets, which use just 8 bits to encode
>> each character. That limits each one to a small set of characters,
>> and with those you wouldn't be able to mix charsets in one document.
<snip/>
>> On the other hand, even if the choice was utf8 (so the documents would
>> be ASCII or utf8), I'd keep iso* support, just in case (therefore my
>> question is 'should we all use the same character encoding when
>> publishing non-English documents? if yes, which one?' and not 'what
>> should a client support?').
>>
>> What's the actual scenario? Is there any client which crashes due to
>> utf8? Which clients are not able to render it correctly? And what
>> about iso* charsets support?
>
> How would we print a Gopher retrieved text document on, for example,
> an older (or mini-mainframe) computer which only uses a Daisy Wheel
> Printer or Teletype Printer (which ONLY supports ASCII characters)?

If the documents (in any of the mentioned encodings) have non-ASCII
characters, the behaviour is undefined (e.g., if the machine ignores the
8th bit, other characters will be rendered instead of the desired ones).
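
To illustrate the 8th-bit case, here is a minimal Python sketch of what
a device that masks off the high bit would receive. The function name
strip_high_bit is hypothetical, and real printers may behave differently
(this only models the "ignore the 8th bit" scenario).

```python
def strip_high_bit(data: bytes) -> str:
    """Model a 7-bit device: clear the 8th bit of every byte."""
    return bytes(b & 0x7F for b in data).decode("ascii")


word = "Técnico"

# UTF-8 encodes "é" as two bytes (0xC3 0xA9), so two wrong
# characters come out in its place.
print(strip_high_bit(word.encode("utf-8")))       # TC)cnico

# ISO-8859-1 encodes "é" as one byte (0xE9), so one wrong
# character comes out in its place.
print(strip_high_bit(word.encode("iso-8859-1")))  # Ticnico
```

In both encodings the output is still valid ASCII, so the device has no
way of knowing that the text is wrong.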

But there's nothing we can do about that, except write a script that
replaces the existing non-ASCII characters with an ASCII description.
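
Such a script could look like the following Python sketch. It is only
one possible approach: the function name asciify, the bracketed output
format, and the small OVERRIDES table are all assumptions made for
illustration, and it assumes the text has already been decoded to
Unicode from whatever encoding the document used.

```python
import unicodedata

# Hand-picked readable replacements (hypothetical; extend as needed).
# "..." for U+2026 is the substitution suggested earlier in the thread.
OVERRIDES = {"\u2026": "..."}


def asciify(text: str) -> str:
    """Replace each non-ASCII character with an ASCII description."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)                 # plain ASCII passes through
        elif ch in OVERRIDES:
            out.append(OVERRIDES[ch])      # preferred short replacement
        else:
            # Fall back to the Unicode character name, or to the code
            # point when the character has no name.
            name = unicodedata.name(ch, f"U+{ord(ch):04X}")
            out.append(f"[{name}]")
    return "".join(out)


print(asciify("Instituto Superior Técnico\u2026"))
# Instituto Superior T[LATIN SMALL LETTER E WITH ACUTE]cnico...
```

The result is verbose but lossless enough for a reader to work out what
the original character was, which is about the best an ASCII-only
printer can offer.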

Avoiding non-ASCII characters is, of course, a good idea. But if a
document is written in a non-Western language, or in any language that
requires another alphabet, ASCII simply cannot represent it.

<snip/>

-- 
Nuno J. Silva (aka njsg)
LEIC student at Instituto Superior Técnico
Lisbon, Portugal
Homepage: http://njsg.no.sapo.pt/
Gopherspace: gopher://sdf-eu.org/11/users/njsg
Registered Linux User #402207 - http://counter.li.org

-=-=-
Ooh, mommy, mommy, what I have now doesn't work in this extremely
unlikely circumstance, so I'll just throw it away and write something
completely new.
        -- Linus Torvalds
