Complete.Org: Mailing Lists: Archives: freeciv-ai: June 2003:
[freeciv-ai] Re: [Fwd: Re: Re: learning from experience]

To: Jeremy Adams <caveman@xxxxxxxxxxx>
Cc: freeciv-ai@xxxxxxxxxxx
Subject: [freeciv-ai] Re: [Fwd: Re: Re: learning from experience]
From: "Alexander L. Vasserman" <avasserm@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 11 Jun 2003 12:44:52 -0400 (EDT)

I think that learning from human players is a great idea. It would
address a lot of issues. It would certainly provide a tighter feedback
loop, so the initial learning can go a lot faster. It would also
simplify the learning algorithms that could be used, since they would
not need to propagate as much information from the end of the game back
to each decision. It would also give the AI an idea of what to do in
situations that the current AI never reaches. And data collection can
start before an explicit decision on the learning algorithm is made, and
can be used by multiple implementations of off-line learning. Online
learning, which might be harder to do, can come after off-line learning
is successfully implemented.

I think the first step would be to implement data collection. (Maybe
some of the current developers could help get this started.) This data
would be useful on its own even if we never implement learning, as it
would show what people actually do when they play the game, rather than
what we think we would have done if charged with making a decision at
each step. Surprisingly enough, the two have been shown to be fairly
different for many other tasks, and I am guessing it would be different
here as well. While I think the data is useful on its own, learning is
the way to go. Not only would it probably provide a better AI now, but
as more features become available in the game, it would give the AI a
mechanism to learn how to use them without a huge development effort and
without having to figure out which rules for the new features to
hard-code. It would also make freeciv unique among current games, and
would exploit all the advantages of Free development, as I would think
more people would be willing to send in their data than for commercial
games.

How about some feedback from current developers?


Alex Vasserman.

On Tue, 10 Jun 2003, Jeremy Adams wrote:

> Sorry, my original email below didn't go directly to the list... I'll
> try that again... ;)
> I thought of this issue as well, and although I haven't provided any
> code so far to Freeciv, my thoughts on this were thus:
> If a "learning AI" were implemented, it would best be done by
> providing an external "trainer" program that tracks and statistically
> analyzes the research trends, build trends, tax/science/luxury
> adjustment trends, and military trends of several human players playing
> against the in-game AI. From that data, more precise values and
> equations could be determined that more accurately match the humans'
> game strategy, and these could then be plugged back into the existing
> AI code. The "trainer" program could also watch the in-game AI play
> itself, refining its equation values to more closely match those of the
> in-game winner, etc.
> Programming such an advanced system is no small feat, but "training"
> the AI this way would at least keep the game playable, since the AI
> would have gone through some amount of "training" to act the way it
> does in-game. At the same time, no single 'illogical' move by a human
> player would break the in-game AI, because its values would already
> have been determined and hard-coded, and therefore not subject to the
> sometimes inconsistent turn-by-turn tactics of the human player. (This
> would also allow some 'playability' tweaking so that the AI is not an
> impossible-to-defeat foe for newbies.)
> -Jeremy
> Ross Wetmore wrote:
> >
> > Per I. Mathisen wrote:
> > [...]
> >
> >> In any case, there is a problem: changing context. A strategy that
> >> works fine in, say, gen 1 may not work as well in gen 2. Some slight
> >> changes in server options might change context enough to throw its
> >> accumulated weights into question.
> >
> >
> > The problem with this is one of poor implementation. The AI needs a
> > feedback loop that can update weights or switch strategies (an
> > alternate way of phrasing "update weights") on a timeframe shorter
> > than a game or release cycle.
> >
> >> What you describe would be quite useful in order to optimize the AI.
> >> We could put researched weights into the rulesets. However, I am
> >> pessimistic about the state of AI research when it comes to writing
> >> an AI that can figure out new strategies on the fly and on its own,
> >> and read something meaningful into accumulated statistical data. But
> >> I would be happy to be proven wrong.
> >>
> >>   - Per
> >
> >
> > Again, if the implementation is explicitly coded in fine detail, then
> > you are correct that this is going to be a total failure.
> >
> > The solution is more successful if it is developed with a fuzzy
> > feedback flavour. Small tactical operations might be handled in a
> > micro-managed fashion, but overall strategy should not be explicitly
> > planned. Rather, it should be handled as a weighted selection of
> > responses to changing game state, where feedback parameters adjust
> > the weighting selection on a chosen timescale.
> >
> > At any given moment, one makes the best selection based on the
> > instantaneous set of strategic weights. But over a longer period, one
> > adjusts or reselects the strategic imperatives based on
> > events/needs/success/failure feedback decisions.
> >
> > Cheers,
> > RossW
> > =====
