[Freeciv-Dev] Re: [RFC PATCH] init_techs

To: Freeciv developers <freeciv-dev@xxxxxxxxxxx>
Subject: [Freeciv-Dev] Re: [RFC PATCH] init_techs
From: Justin Moore <justin@xxxxxxxxxxx>
Date: Wed, 26 Sep 2001 01:53:35 -0400 (EDT)

> > > An aside: a standard way to denote words with non-alphanumerics must
> > > be picked.  I don't care what it is, but adding the ability to interpret
> > > the command line above leads to problems.
> >
> >    Feel free to suggest a system.  I just write the parsing code around
> > here (and occasionally serve as village idiot).
>
> (Are you asking me to move to the next village?)

   Nah, I can go back to being the village idiot on Beowulf. :)

> Words on the command line are separated with whitespace.
> Except in the /set option=value case.

/* From within the set_command function */
char *args[2];
split("\\S=", buf, args, 2);  /* backslash doubled: "\S" is not a valid C escape */
/* args[0] = "option", args[1] = "value"  :) */

> Some identifiers can contain spaces: directory names, proper names,
> but no identifier can begin or end with a space.

   That's why I wanted to distinguish between removing/splitting on
surrounding whitespace ("\s") and removing/splitting on all whitespace ("\S").
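
For illustration, here is a minimal sketch of the /set option=value case.
It assumes a hypothetical helper, split_option_value, which is not the split
function from the patch (its exact semantics are not shown in this message).
The sketch cuts the buffer at the first '=' and trims the whitespace
surrounding each half, while whitespace inside a half is kept.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Trim leading and trailing whitespace in place; returns the trimmed start. */
static char *trim(char *s)
{
  char *end;

  while (isspace((unsigned char)*s)) {
    s++;
  }
  end = s + strlen(s);
  while (end > s && isspace((unsigned char)end[-1])) {
    end--;
  }
  *end = '\0';
  return s;
}

/* Hypothetical helper (assumption, not from the patch): split
 * "option = value" at the first '='.  Modifies buf in place.
 * Returns 2 when both halves were found, or 1 when there is no '='
 * (args[1] is then NULL). */
static int split_option_value(char *buf, char *args[2])
{
  char *eq;

  assert(buf != NULL && args != NULL);
  eq = strchr(buf, '=');
  if (eq == NULL) {
    args[0] = trim(buf);
    args[1] = NULL;
    return 1;
  }
  *eq = '\0';
  args[0] = trim(buf);
  args[1] = trim(eq + 1);
  return 2;
}
```

Because the pointers in args point into buf, nothing has to be freed, but buf
must stay alive for as long as args is in use.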

> Proper names can contain [-'.], and possibly non-ASCII characters
> ("Valéry Giscard d'Estaing", "Dr. Jekyll-Hyde"), but never =.
> Some other words can contain other kinds of characters.  Unix directory
> names, for instance, can contain every character except / and \0.
>
> One approach is to have a system in which all strings can be represented
> as words on the command line.  One way to do it is to use " as a
> special quote character:
>
>   /set attribute nation.Burgundians.techs Bouillabaisse,"Hors d'Oeuvre"
>
> Then some way has to be found to denote the empty string, and
> the " character itself.  One way is to let "" mean
>
>   - the " character,  after an odd number of "s
>   - the empty string, after an even number of "s
>
> An alternative is to use an escape character, eg. \ as in Unix shells.

   I think this would be the simpler way to do it.
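
A minimal sketch of the escape-character variant, under stated assumptions:
whitespace separates words, whitespace inside "..." is literal, and a
backslash makes the next character literal anywhere. The name tokenize and
its signature are hypothetical, not taken from the Freeciv source.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical tokenizer: splits line, in place, into at most max_words
 * words.  Returns the number of words stored, or -1 on an unterminated
 * quote or a trailing backslash.  Input beyond max_words is ignored. */
static int tokenize(char *line, char *words[], int max_words)
{
  char *src = line;
  int n = 0;

  assert(line != NULL && words != NULL && max_words > 0);
  for (;;) {
    char *dst;
    int in_quotes = 0;

    while (isspace((unsigned char)*src)) {
      src++;
    }
    if (*src == '\0' || n == max_words) {
      return n;
    }
    /* dst never overtakes src, so the word can be rewritten in place. */
    words[n++] = dst = src;
    while (*src != '\0' && (in_quotes || !isspace((unsigned char)*src))) {
      if (*src == '\\') {
        if (src[1] == '\0') {
          return -1;            /* nothing left to escape */
        }
        *dst++ = src[1];        /* escaped character is taken literally */
        src += 2;
      } else if (*src == '"') {
        in_quotes = !in_quotes; /* the quote character itself is dropped */
        src++;
      } else {
        *dst++ = *src++;
      }
    }
    if (in_quotes) {
      return -1;                /* closing quote is missing */
    }
    if (*src != '\0') {
      src++;                    /* step over the separating whitespace */
    }
    *dst = '\0';                /* terminate the word just collected */
  }
}
```

With this sketch the quoted example line splits into four words, the last
being Bouillabaisse,Hors d'Oeuvre, and an empty string can be written as "".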

> Another problem is special metanotation.  For example, it must be
> possible to denote "all players". "ALL" and "*" are obvious choices.
> Whatever the choice, it must be possible to distinguish this notation
> from a player name "*" or "ALL".  The way in which this is done must be
> general (not specific to player names).  Furthermore, the system must
> be extensible: new metanotations may be introduced later on.  I think
> the simplest way to deal with it is to keep within the Freeciv code a
> list of the special metanotations and the version of the language in
> which they appeared, and write the language version number to each file
> in which the language is used (just like it is already done with rulesets).

   IMHO, the "*" character would be the best choice.  Anyone who names
their character "*" is probably looking to see what will break. :)

> So my proposal is to use whitespace as a word separator, to always allow
> any identifier to be quoted with " (which, when used on metanotation, will
> escape its special meaning), and to denote the literal " character within
> an identifier as "".
>
> I think this is good enough, but I'm not sure.  There is no convention
> for continuing a command line or even an identifier on the next line.
> Using the full C syntax for strings may be a better idea.
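
The doubled-quote proposal above can be sketched as follows. This is only
an illustration under assumptions: decode_word and is_all_players are
hypothetical names, the sketch handles a word that is quoted as a whole
(not a quote starting mid-word), and "*" is assumed to be the chosen
metanotation for all players.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical decoder for one word.  If the word starts with ", it runs
 * to the closing "; inside it, "" stands for one literal ".  Otherwise the
 * word is copied unchanged.  *quoted records whether quotes were used, so
 * a caller can tell * (metanotation) from "*" (an ordinary name).
 * Returns 0 on success, -1 on a malformed word or if out is too small. */
static int decode_word(const char *word, char *out, size_t out_size,
                       int *quoted)
{
  size_t n = 0;

  assert(word != NULL && out != NULL && quoted != NULL && out_size > 0);
  *quoted = (word[0] == '"');
  if (!*quoted) {
    if (strlen(word) >= out_size) {
      return -1;
    }
    strcpy(out, word);
    return 0;
  }
  word++;                          /* skip the opening quote */
  for (;;) {
    if (*word == '\0') {
      return -1;                   /* no closing quote */
    }
    if (*word == '"') {
      if (word[1] != '"') {
        break;                     /* a single " closes the word */
      }
      word++;                      /* "" collapses to one literal " */
    }
    if (n + 1 >= out_size) {
      return -1;
    }
    out[n++] = *word++;
  }
  out[n] = '\0';
  return word[1] == '\0' ? 0 : -1; /* nothing may follow the closing quote */
}

/* The special meaning applies only when the star was written unquoted. */
static int is_all_players(const char *decoded, int quoted)
{
  return !quoted && strcmp(decoded, "*") == 0;
}
```

Under this sketch "" decodes to the empty string and """" decodes to a
single " character, which matches the odd/even rule given earlier.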

   I think a generic get_next_command_line would be good.  It could skip
blank lines and lines whose first non-whitespace character is a '#', and
append the next line to any line that ends in a '\\'.
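
A rough sketch of such a reader, assuming input comes from a stdio FILE
and the caller supplies the buffer. The signature is hypothetical. One
design choice that is an assumption here: the comment check is applied
only to the start of a logical line, so a continuation line that begins
with '#' is appended rather than skipped.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical reader: stores the next logical command line in buf.
 * Skips blank lines and lines whose first non-whitespace character is '#';
 * a line ending in a backslash is joined with the line that follows.
 * Returns buf, or NULL at end of input or if the command does not fit. */
static char *get_next_command_line(FILE *fp, char *buf, size_t size)
{
  size_t len = 0;                   /* characters already kept in buf */

  assert(fp != NULL && buf != NULL && size > 1);
  while (fgets(buf + len, (int)(size - len), fp) != NULL) {
    size_t n = strlen(buf + len);
    const char *p = buf + len;

    if (n > 0 && buf[len + n - 1] == '\n') {
      n--;
      buf[len + n] = '\0';          /* drop the newline */
    } else if (!feof(fp)) {
      return NULL;                  /* physical line too long for buf */
    }
    if (len == 0) {
      while (isspace((unsigned char)*p)) {
        p++;
      }
      if (*p == '\0' || *p == '#') {
        continue;                   /* blank line or comment */
      }
    }
    if (n > 0 && buf[len + n - 1] == '\\') {
      len += n - 1;                 /* keep the text, drop the backslash */
      buf[len] = '\0';
      if (size - len < 2) {
        return NULL;                /* no room left for the continuation */
      }
      continue;
    }
    return buf;
  }
  return len > 0 ? buf : NULL;      /* input ended after a continuation */
}
```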

> > > Don't you think all commands should be parsed with standard functions that
> > > perform typechecking?
> >
> >    I do, but I think they should be in the cmd_* functions, not in the
> > parsing code.  See my previous e-mails/rants as to why.
>
> I think syntactic feedback ("argument %d of command %s cannot be an %s")
> is more informative than semantic feedback (an error message that takes
> advantage of the actual meaning of the command, but omits syntactic detail).
> The reader wants to debug the command.  So if the messages are passed to
> the cmd_* functions the syntactic information should at least be
> passed along.  It isn't at the moment.

   With my system we'd know inside the function what chunk it barfed on.
Each command could call

char* check_types(CMD, buf, WHAT_DO_I_WANT);

   where CMD and WHAT_DO_I_WANT are some sort of enum/#defines.  It would
return a strdup of the original args passed to the command, which would be
freed at the end of the function.  An error message could get spit out:

printf("Unexpected or invalid input: \"%s\" in command \"%s\".\n%s",
   args[bad_arg], orig_buf, usage_example[CMD]);

I find a sample usage to be much clearer than "I wanted an 'int', you
idiot." :) Plus if a programmer wants to change the kind of data their
command accepts, they just need to change it in one place.
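
To make the idea concrete, here is a small sketch. Everything in it is an
assumption for illustration: the cmd_spec table, the arg_type tags, and
the function signatures are hypothetical, and unlike the check_types call
described above it returns the index of the offending argument instead of
a strdup'd copy of the arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

enum arg_type { ARG_WORD, ARG_INT };   /* hypothetical type tags */

struct cmd_spec {
  const char *usage;                   /* sample usage shown on error */
  int num_args;                        /* at most 4 in this sketch */
  enum arg_type types[4];
};

/* Returns non-zero if arg is acceptable for the given type. */
static int arg_matches(const char *arg, enum arg_type type)
{
  char *end;

  if (arg[0] == '\0') {
    return 0;
  }
  if (type == ARG_WORD) {
    return 1;
  }
  (void)strtol(arg, &end, 10);         /* only the end position is needed */
  return end != arg && *end == '\0';   /* whole argument must be a number */
}

/* Returns -1 if the arguments fit the spec, otherwise the index of the
 * first offending argument (equal to argc when one is missing). */
static int check_types(const struct cmd_spec *spec,
                       const char *const args[], int argc)
{
  int i;

  assert(spec != NULL && args != NULL && spec->num_args <= 4);
  for (i = 0; i < spec->num_args; i++) {
    if (i >= argc || !arg_matches(args[i], spec->types[i])) {
      return i;
    }
  }
  return argc > spec->num_args ? spec->num_args : -1;
}

/* Formats the error text, ending with the command's sample usage. */
static int format_error(char *out, size_t size, const struct cmd_spec *spec,
                        const char *orig_buf, const char *const args[],
                        int argc, int bad_arg)
{
  return snprintf(out, size,
                  "Unexpected or invalid input: \"%s\" in command \"%s\".\n%s\n",
                  bad_arg < argc ? args[bad_arg] : "(missing)",
                  orig_buf, spec->usage);
}
```

Changing what a command accepts then means editing only its cmd_spec
entry, which is the one-place property mentioned above.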

-jdm

Department of Computer Science, Duke University, Durham, NC 27708-0129
Email:  justin@xxxxxxxxxxx


