Complete.Org: Mailing Lists: Archives: freeciv-dev: January 2001:
[Freeciv-Dev] Profiling Freeciv.
Home

[Freeciv-Dev] Profiling Freeciv.

[Top] [All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index] [Thread Index]
To: freeciv-dev@xxxxxxxxxxx
Subject: [Freeciv-Dev] Profiling Freeciv.
From: Gaute B Strokkenes <gs234@xxxxxxxxx>
Date: 04 Jan 2001 18:12:06 +0100

Yesterday, I decided to find how profiling works and see if I could do
anything to speed up civserver, which tends to run a bit slowly on my
old 486 during games with large maps and lots of AIs.

So, I recompiled the server with make CFLAGS="-pg -g -a -O2" and
profiled a turn of my current game.  That is, I loaded the game, hit
the end turn button without moving or changing anything, and quit the
server after the end / beginning of turn processing.

For those who are interested, you should be able to find the savegame
that I used at http://www.srcf.ucam.org/~gs234/newworld14.sav.bz2
pretty soon.  Note that this is a big (200 * 100) map with lots of
cities.

This is the (abbreviated) output from gprof:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
  9.42    111.76   111.76 46134854     0.00     0.00  map_get_city
  7.84    204.70    92.94     4891     0.02     0.04  really_generate_warmap
  7.40    292.45    87.75 43783222     0.00     0.00  map_get_tile
  6.61    370.89    78.44 34896294     0.00     0.00  map_get_terrain
  6.18    444.17    73.28     1154     0.06     0.39  ai_manage_explorer
  5.13    504.96    60.79 23559318     0.00     0.00  map_get_known
  4.27    555.62    50.66 41563918     0.00     0.00  normalize_map_pos
  3.73    599.87    44.25  2884741     0.00     0.00  is_my_zoc
  2.60    630.69    30.82  1504331     0.00     0.00  invasion_funct
  2.47    659.98    29.29 19415616     0.00     0.00  find_genlist_position
  2.04    684.13    24.15 12499745     0.00     0.00  is_non_allied_unit_tile
  1.99    707.77    23.64     2041     0.01     0.16  find_something_to_kill
  1.75    728.57    20.80 19415495     0.00     0.00  genlist_iterator_init
  1.67    748.41    19.84  4777398     0.00     0.00  get_from_mapqueue
  1.55    766.81    18.40      249     0.07     0.53  auto_settler_findwork
  1.35    782.77    15.96   229328     0.00     0.00  road_bonus
  1.16    796.50    13.73  4772135     0.00     0.00  add_to_mapqueue
  1.14    810.00    13.50  6137906     0.00     0.00  map_get_continent
  1.13    823.40    13.40  3994376     0.00     0.00  map_get_special
  1.01    835.42    12.02  9073073     0.00     0.00  is_ground_unit

[*lots* of stuff snipped]

  0.00   1186.09     0.00        1     0.00     0.02  wipe_unit

 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone.  This is the major sort for this
           listing.

calls      the number of times this function was invoked, if
           this function is profiled, else blank.
 
 self      the average number of milliseconds spent in this
ms/call    function per call, if this function is profiled,
           else blank.

 total     the average number of milliseconds spent in this
ms/call    function and its descendents per call, if this 
           function is profiled, else blank.

name       the name of the function.  This is the minor sort
           for this listing. The index shows the location of
           the function in the gprof listing. If the index is
           in parenthesis it shows where it would appear in
           the gprof listing if it were to be printed.

After meditating on these numbers for a while, I decided to try
replacing map_get_city(), map_get_tile() and map_get_terrain() with
defines, which yielded

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
  9.52     99.90    99.90     4891     0.02     0.04  really_generate_warmap
  9.40    198.54    98.64  1504331     0.00     0.00  invasion_funct
  8.67    289.57    91.03     1154     0.08     0.33  ai_manage_explorer
  7.38    367.05    77.48  2884741     0.00     0.00  is_my_zoc
  5.10    420.55    53.50 23559372     0.00     0.00  map_get_known
  4.98    472.77    52.22 41563918     0.00     0.00  normalize_map_pos
  3.08    505.11    32.34 19415619     0.00     0.00  find_genlist_position
  2.24    528.60    23.49     2041     0.01     0.14  find_something_to_kill
  1.93    548.86    20.26 19415495     0.00     0.00  genlist_iterator_init
  1.91    568.91    20.05 12499745     0.00     0.00  is_non_allied_unit_tile
  1.78    587.64    18.73   229328     0.00     0.00  road_bonus
  1.76    606.12    18.48  4777398     0.00     0.00  get_from_mapqueue
  1.57    622.56    16.44      249     0.07     0.54  auto_settler_findwork
  1.36    636.83    14.27   359387     0.00     0.00  city_get_food_tile
  1.21    649.52    12.69  3994376     0.00     0.00  map_get_special
  1.20    662.15    12.63  6137906     0.00     0.00  map_get_continent
  1.19    674.66    12.51  7137916     0.00     0.00  unit_flag
  1.13    686.51    11.85  4772135     0.00     0.00  add_to_mapqueue
  1.08    697.82    11.31   409129     0.00     0.00  
city_reinforcements_cost_and_value
  0.90    707.24     9.42   364843     0.00     0.00  city_get_shields_tile

[snip]

  0.00   1049.50     0.00        1     0.00     0.01  wipe_unit

As you can see, making these very small changes shave a whooping 11.5%
of the execution time for this fairly simple example.  While you'd
definitely want to test this with a much wider range of games and
scenarios, it does seem like it might be worthwhile.

Does anyone else do this sort of thing?  My gut feeling is that
there's room for a fair bit of optimisation (including things that are
more clever than replacing functions with defines), though I'm not
really very experienced at this sort of thing, so I might have made
some sort of stupid beginner's mistake that renders all of my numbers
useless...

-- 
Big Gaute                               http://www.srcf.ucam.org/~gs234/
Intra-mural sports results are filtering through th' plumbing...



[Prev in Thread] Current Thread [Next in Thread]