[Freeciv-Dev] Re: Split patch (was Re: [RFC PATCH] init_techs)

To: freeciv-dev@xxxxxxxxxxx (Freeciv developers)
Subject: [Freeciv-Dev] Re: Split patch (was Re: [RFC PATCH] init_techs)
From: Reinier Post <rp@xxxxxxxxxx>
Date: Sun, 30 Sep 2001 23:20:39 +0200

On Fri, Sep 28, 2001 at 10:15:40PM -0400, Ross W. Wetmore wrote:
> At 09:05 AM 01/09/28 +0200, Reinier Post wrote:
> >On Fri, Sep 28, 2001 at 12:00:07AM -0400, Ross W. Wetmore wrote:
> >> split should treat the buffer it was handed as working memory, return
> >> pointers into the parsed string elements, and let the caller deal with
> >> ALL memory issues.
> >> 
> >> It is really the only sensible general purpose solution for something 
> >> like this.
> >
> >Only if you do not have to expand the input while tokenizing it.
> >Then you can change whitespace to NUL a la strtok() either on the
> >original or on a copy.  I think Freeciv will meet this case
> >regardless of the extended command syntax Justin decides on.
> >
> >-- 
> >Reinier
> 
> I think the restriction is that you always have to separate tokens by
> at least some char sized element in the initial buffer.

Yes, this is a restriction if you want to split the input into tokens
before doing any interpretation on them.  The problem is single-char tokens
that need no surrounding whitespace, such as ";".
To circumvent this for ";" and "," you have to distinguish three
separate tokenization stages:  breaking the input into commands (which
will split on ";"), breaking commands into command words (which will
split on whitespace and possibly "=") and breaking argument words into
subtokens (which will split on ",", possibly ", ", and possibly other
things).  And this only works when you know that ";" and "," are going
to be the only cases to worry about.
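
To make the three stages concrete, here is a minimal sketch in C.  The
names split_in_place() and count_subtokens() are made up for this
illustration and are not part of Freeciv; the fixed array sizes are
arbitrary.  Each stage overwrites its delimiters with NUL and returns
pointers into the same buffer, so nothing is copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not Freeciv's API): split `buf` in place on any
 * character in `delims`, storing up to `max` token pointers in `out`.
 * Delimiters are overwritten with NUL and empty tokens are skipped.
 * Text beyond the `max`-th token is ignored.  Returns the token count. */
static size_t split_in_place(char *buf, const char *delims,
                             char **out, size_t max)
{
  size_t n = 0;
  char *p = buf;

  assert(buf != NULL && delims != NULL && out != NULL);

  while (n < max) {
    p += strspn(p, delims);      /* skip leading delimiters */
    if (*p == '\0') {
      break;
    }
    out[n++] = p;
    p += strcspn(p, delims);     /* advance to the end of the token */
    if (*p == '\0') {
      break;
    }
    *p++ = '\0';                 /* terminate the token in the buffer */
  }
  return n;
}

/* Run all three stages on one input line and return the total number
 * of subtokens found.  The limit of 8 per stage is arbitrary. */
static size_t count_subtokens(char *line)
{
  char *cmds[8], *words[8], *subs[8];
  size_t total = 0;
  size_t ncmds = split_in_place(line, ";", cmds, 8);      /* stage 1 */

  for (size_t i = 0; i < ncmds; i++) {
    size_t nwords = split_in_place(cmds[i], " \t", words, 8); /* stage 2 */

    for (size_t j = 0; j < nwords; j++) {
      total += split_in_place(words[j], ",", subs, 8);    /* stage 3 */
    }
  }
  return total;
}
```

Note that this only works because every delimiter occupies at least one
char in the buffer that can be overwritten, which is exactly the
restriction discussed above.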

> It should be trivial to write a wrapper, that split, then did something
> like hashed token lookups to return any actual expanded items[] that might 
> be needed for a given parsing algorithm if you need this.

Yes, but the lookup required may depend on context, so you have to
do some syntactic analysis before you can do this.  E.g. at the
moment the following are valid commands:

  set set hard
  set hard hard
  set techs units
  set units hard
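
A rough C sketch of why a context-free lookup is not enough for the
commands above: the same word can play different roles depending on
where it appears.  The role names and the positional rule in role_of()
are assumptions invented for this illustration; they are not how the
Freeciv server actually parses its commands.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical roles a word can play within one command. */
enum token_role { ROLE_COMMAND, ROLE_OPTION, ROLE_VALUE };

/* Assumed rule, for illustration only: the first word is the command;
 * in a "set" command the second word names an option; every other
 * word is a value.  `index` must refer to an existing word. */
static enum token_role role_of(const char *const *words, size_t index)
{
  assert(words != NULL && words[0] != NULL);

  if (index == 0) {
    return ROLE_COMMAND;
  }
  if (index == 1 && strcmp(words[0], "set") == 0) {
    return ROLE_OPTION;
  }
  return ROLE_VALUE;
}
```

Under this assumed rule, the two occurrences of "hard" in "set hard hard"
get different roles, so a lookup keyed on the word alone cannot tell
them apart.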

> Note this is an example where you don't want to strdup() the initial 
> split() elements, and may not need to do this for the lookup values 
> depending on how the overall algorithm was designed. So extra 
> strdup()/free() calls just waste performance cycles here.

Yes.  I think it may be better to allow each stage to modify the original
buffers (as strtok() does) as long as it undoes its changes before returning.
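
One way this could look in C, as a sketch only: the splitter logs every
character it overwrites, and an undo call puts them back.  The names
cut_log, split_logged() and undo_split() and the MAX_CUTS limit are
invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CUTS 16             /* arbitrary limit for this sketch */

/* Record of every character that was overwritten with NUL. */
struct cut_log {
  char *where[MAX_CUTS];        /* positions that were overwritten */
  char what[MAX_CUTS];          /* original character at each position */
  size_t count;
};

/* Split `buf` in place like strtok() would, but log each overwritten
 * delimiter so the buffer can be restored later.  Stops after `max`
 * tokens or MAX_CUTS overwrites.  Returns the number of tokens. */
static size_t split_logged(char *buf, const char *delims,
                           char **out, size_t max, struct cut_log *log)
{
  size_t n = 0;
  char *p = buf;

  assert(buf != NULL && delims != NULL && out != NULL && log != NULL);
  log->count = 0;

  while (n < max && log->count < MAX_CUTS) {
    p += strspn(p, delims);     /* skip leading delimiters */
    if (*p == '\0') {
      break;
    }
    out[n++] = p;
    p += strcspn(p, delims);    /* advance to the end of the token */
    if (*p == '\0') {
      break;
    }
    log->where[log->count] = p; /* remember what we are about to destroy */
    log->what[log->count] = *p;
    log->count++;
    *p++ = '\0';
  }
  return n;
}

/* Put back every character that split_logged() overwrote. */
static void undo_split(struct cut_log *log)
{
  assert(log != NULL);
  while (log->count > 0) {
    log->count--;
    *log->where[log->count] = log->what[log->count];
  }
}
```

The token pointers are only usable between the split and the undo, so
a stage has to finish with its tokens (or hand them to the next stage)
before restoring the buffer.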

> Cheers,
> RossW
> =====

-- 
Reinier

