Complete.Org: Mailing Lists: Archives: freeciv-dev: April 1999:
[Freeciv-Dev] Some performance issues
Home

[Freeciv-Dev] Some performance issues

[Top] [All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index] [Thread Index]
To: freeciv-dev@xxxxxxxxxxx
Subject: [Freeciv-Dev] Some performance issues
From: Rizos Sakellariou <rizos@xxxxxxxxxxx>
Date: Sat, 17 Apr 1999 22:48:59 -0500 (CDT)

As many freeciv-ers may have noticed, the time taken by the server
to handle ai's (plus other standard bookkeeping stuff) at the beginning 
(or end if you prefer) of each turn becomes noticeable, especially
as the game progresses and empires grow larger. Some time ago, i did
some profiling of one full turn, on an Ultra Sparc, of a game well 
advanced to locate possible sources of this inefficiency.

Although the profiling results may vary from game to game and, some times, 
some of these costs may be unavoidable, what i got revealed a major problem: 
From the roughly 100 secs that the turn lasted (ie, from load game to
turn done, excluding profiling costs), approximately 14.5 were spent
in multiplications, with a total of 55 million multiplications performed!
These were mainly called from map_get_city() (18M) and map_get_tile() (24M)
to convert the X,Y coordinates to a single value for locating the
corresponding data.

Let me say here that multiplication is still an expensive operation on most 
modern processors with a cost that varies from 10 to 14 CPU cycles in many 
cases. Compilers are typically smart enough to convert multiplications of a 
variable by an integer to a series of ADDs and SHIFTs (thereby reducing
the cost of a MUL), but they are left with no choice in the case of
the multiplication of two unknowns as it is our case here. As a solution,
i suggest to replace all occurrences of y*map.xsize with something like 
an array ymult[y] whose elements (from 0 to map.ysize) are precomputed
at the beginning of each game. Depending on the machine used this may
have a drastical effect on reducing the time spent here.

Other routines of high cost:
- the whole ai_military_attack() : 97msec each call.
- the whole ai_manage_explorer() : 93msec each call.
- the whole find_the_shortest_path() : 69msec each call.

I have little to zero time to take a more serious look into the above
but i hope that this may prompt others to take a go.

regards,
--rizos

PS: I attach the subroutines whose total time of _self_ execution exceeds 
    0.10 secs. The full list is available on request although the experiment 
    can be easily repeated if the code is compiled with the appropriate flags 
    (man prof, or, better, man gprof).

=============================================================================

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 23.79     30.84    30.84                             mcount
 16.49     52.21    21.37                             oldarc
 11.20     66.73    14.52 55995562     0.00     0.00  .mul
  4.87     73.04     6.31  1502034     0.00     0.02  is_my_zoc
  3.88     78.07     5.03 18289911     0.00     0.00  map_get_city
  3.87     83.08     5.01                             next
  3.79     87.99     4.92 24188475     0.00     0.00  map_get_tile
  2.38     91.07     3.08     1125     2.74     6.02  really_generate_warmap
  2.24     93.97     2.90     2793     1.04     1.04  _libc_write
  2.06     96.65     2.68                             done
  1.60     98.73     2.08                             quadruple_to_decimal
  1.53    100.71     1.98 18535603     0.00     0.00  find_genlist_position
  1.52    102.67     1.97 14108344     0.00     0.00  is_enemy_unit_tile
  1.47    104.57     1.90 12723915     0.00     0.00  is_enemy_city_tile
  1.34    106.31     1.74                             moncontrol
  1.09    107.72     1.41      435     3.24    69.59  find_the_shortest_path
  1.00    109.02     1.30 15219814     0.00     0.00  genlist_get
  0.94    110.24     1.22  5650210     0.00     0.00  map_get_terrain
  0.92    111.43     1.19  4082039     0.00     0.00  map_get_known
  0.83    112.51     1.08  1314102     0.00     0.02  could_unit_move_to_tile
  0.79    113.54     1.03 15219812     0.00     0.00  unit_list_get
  0.56    114.27     0.73  1465825     0.00     0.00  .div
  0.56    115.00     0.73                             chainloop
  0.44    115.57     0.57  1660856     0.00     0.00  dir_ok
  0.42    116.11     0.54  3975344     0.00     0.00  unit_flag
  0.42    116.65     0.54      475     1.14     1.74  look_for_charge
  0.39    117.15     0.50  1322016     0.00     0.02  goto_tile_cost
  0.37    117.63     0.48  3535869     0.00     0.00  is_ground_unit
  0.35    118.09     0.46     2245     0.20     0.21  map_clear_special
  0.34    118.53     0.44      177     2.49    92.92  ai_manage_explorer
  0.32    118.95     0.42    20801     0.02     0.02  genlist_size
  0.27    119.30     0.35  1367734     0.00     0.00  get_defender
  0.26    119.64     0.34   929101     0.00     0.00  map_get_continent
  0.26    119.98     0.34  1403142     0.00     0.00  is_tiles_adjacent
  0.25    120.30     0.32  1995758     0.00     0.00  same_pos
  0.22    120.59     0.29    85937     0.00     0.01  _doprnt
  0.22    120.87     0.28  3315789     0.00     0.00  genlist_iterator_init
  0.20    121.13     0.26     1678     0.15     0.15  init_warmap
  0.17    121.35     0.22  1094963     0.00     0.00  add_to_stack
  0.16    121.56     0.21  1109092     0.00     0.00  improvement_exists
  0.16    121.76     0.21   354390     0.00     0.02  tile_is_accessible
  0.15    121.96     0.20                             _sbrk_unlocked
  0.15    122.16     0.20     4001     0.05     0.08  tile_init
  0.15    122.35     0.19  1111014     0.00     0.00  is_friendly_unit_tile
  0.14    122.53     0.19    54254     0.00     0.01  reinforcements_cost
  0.13    122.70     0.17   490296     0.00     0.00  map_get_special
  0.13    122.87     0.17      446     0.38    15.89  find_something_to_kill
  0.12    123.03     0.16    14752     0.01     0.01  is_friendly_city_tile
  0.12    123.19     0.16  1094963     0.00     0.00  get_from_warstack
  0.12    123.34     0.15    27558     0.01     0.02  could_be_my_zoc
  0.12    123.49     0.15    10894     0.01     0.01  unit_list_find
  0.12    123.64     0.15   161236     0.00     0.01  get_total_defense_power
  0.11    123.78     0.14     4179     0.03     0.03  unit_list_init
  0.11    123.92     0.14    90504     0.00     0.01  invasion_funct
  0.10    124.05     0.13   615181     0.00     0.00  
is_transporter_with_free_space
  0.10    124.17     0.13     2011     0.06     0.07  can_unit_move_to_tile
  0.10    124.30     0.13    54274     0.00     0.01  reinforcements_value
  0.09    124.42     0.12     6477     0.02     0.02  is_field_unit
  0.09    124.54     0.12      183     0.66     9.27  assess_danger
  0.08    124.65     0.11   918313     0.00     0.00  is_sailing_unit
  0.08    124.76     0.11   139851     0.00     0.00  strlen
  0.08    124.86     0.11   123483     0.00     0.00  dir_ect
  0.08    124.96     0.10     8287     0.01     0.01  is_heli_unit
  0.08    125.06     0.10        6    16.67    16.67  _open
  0.08    125.16     0.10   488379     0.00     0.00  .umul
  0.08    125.26     0.10    97176     0.00     0.00  get_shields_tile
  0.08    125.36     0.10    17112     0.01     0.13  light_square
  0.07    125.46     0.10      435     0.22     0.22  init_gotomap

[snipped]

 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone.  This is the major sort for this
           listing.

calls      the number of times this function was invoked, if
           this function is profiled, else blank.
 
 self      the average number of milliseconds spent in this
ms/call    function per call, if this function is profiled,
           else blank.

 total     the average number of milliseconds spent in this
ms/call    function and its descendents per call, if this 
           function is profiled, else blank.

name       the name of the function.  This is the minor sort
           for this listing. The index shows the location of
           the function in the gprof listing. If the index is
           in parenthesis it shows where it would appear in
           the gprof listing if it were to be printed.

[Prev in Thread] Current Thread [Next in Thread]