[Freeciv-Dev] Some performance issues
[Top] [All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index] [Thread Index]
As many freeciv-ers may have noticed, the time taken by the server
to handle ai's (plus other standard bookkeeping stuff) at the beginning
(or end if you prefer) of each turn becomes noticeable, especially
as the game progresses and empires grow larger. Some time ago, i did
some profiling of one full turn, on an Ultra Sparc, of a game well
advanced to locate possible sources of this inefficiency.
Although the profiling results may vary from game to game and, some times,
some of these costs may be unavoidable, what i got revealed a major problem:
From the roughly 100 secs that the turn lasted (ie, from load game to
turn done, excluding profiling costs), approximately 14.5 were spent
in multiplications, with a total of 55 million multiplications performed!
These were mainly called from map_get_city() (18M) and map_get_tile() (24M)
to convert the X,Y coordinates to a single value for locating the
corresponding data.
Let me say here that multiplication is still an expensive operation on most
modern processors with a cost that varies from 10 to 14 CPU cycles in many
cases. Compilers are typically smart enough to convert multiplications of a
variable by an integer to a series of ADDs and SHIFTs (thereby reducing
the cost of a MUL), but they are left with no choice in the case of
the multiplication of two unknowns as it is our case here. As a solution,
i suggest to replace all occurrences of y*map.xsize with something like
an array ymult[y] whose elements (from 0 to map.ysize) are precomputed
at the beginning of each game. Depending on the machine used this may
have a drastical effect on reducing the time spent here.
Other routines of high cost:
- the whole ai_military_attack() : 97msec each call.
- the whole ai_manage_explorer() : 93msec each call.
- the whole find_the_shortest_path() : 69msec each call.
I have little to zero time to take a more serious look into the above
but i hope that this may prompt others to take a go.
regards,
--rizos
PS: I attach the subroutines whose total time of _self_ execution exceeds
0.10 secs. The full list is available on request although the experiment
can be easily repeated if the code is compiled with the appropriate flags
(man prof, or, better, man gprof).
=============================================================================
Each sample counts as 0.01 seconds.
% cumulative self self total
time seconds seconds calls ms/call ms/call name
23.79 30.84 30.84 mcount
16.49 52.21 21.37 oldarc
11.20 66.73 14.52 55995562 0.00 0.00 .mul
4.87 73.04 6.31 1502034 0.00 0.02 is_my_zoc
3.88 78.07 5.03 18289911 0.00 0.00 map_get_city
3.87 83.08 5.01 next
3.79 87.99 4.92 24188475 0.00 0.00 map_get_tile
2.38 91.07 3.08 1125 2.74 6.02 really_generate_warmap
2.24 93.97 2.90 2793 1.04 1.04 _libc_write
2.06 96.65 2.68 done
1.60 98.73 2.08 quadruple_to_decimal
1.53 100.71 1.98 18535603 0.00 0.00 find_genlist_position
1.52 102.67 1.97 14108344 0.00 0.00 is_enemy_unit_tile
1.47 104.57 1.90 12723915 0.00 0.00 is_enemy_city_tile
1.34 106.31 1.74 moncontrol
1.09 107.72 1.41 435 3.24 69.59 find_the_shortest_path
1.00 109.02 1.30 15219814 0.00 0.00 genlist_get
0.94 110.24 1.22 5650210 0.00 0.00 map_get_terrain
0.92 111.43 1.19 4082039 0.00 0.00 map_get_known
0.83 112.51 1.08 1314102 0.00 0.02 could_unit_move_to_tile
0.79 113.54 1.03 15219812 0.00 0.00 unit_list_get
0.56 114.27 0.73 1465825 0.00 0.00 .div
0.56 115.00 0.73 chainloop
0.44 115.57 0.57 1660856 0.00 0.00 dir_ok
0.42 116.11 0.54 3975344 0.00 0.00 unit_flag
0.42 116.65 0.54 475 1.14 1.74 look_for_charge
0.39 117.15 0.50 1322016 0.00 0.02 goto_tile_cost
0.37 117.63 0.48 3535869 0.00 0.00 is_ground_unit
0.35 118.09 0.46 2245 0.20 0.21 map_clear_special
0.34 118.53 0.44 177 2.49 92.92 ai_manage_explorer
0.32 118.95 0.42 20801 0.02 0.02 genlist_size
0.27 119.30 0.35 1367734 0.00 0.00 get_defender
0.26 119.64 0.34 929101 0.00 0.00 map_get_continent
0.26 119.98 0.34 1403142 0.00 0.00 is_tiles_adjacent
0.25 120.30 0.32 1995758 0.00 0.00 same_pos
0.22 120.59 0.29 85937 0.00 0.01 _doprnt
0.22 120.87 0.28 3315789 0.00 0.00 genlist_iterator_init
0.20 121.13 0.26 1678 0.15 0.15 init_warmap
0.17 121.35 0.22 1094963 0.00 0.00 add_to_stack
0.16 121.56 0.21 1109092 0.00 0.00 improvement_exists
0.16 121.76 0.21 354390 0.00 0.02 tile_is_accessible
0.15 121.96 0.20 _sbrk_unlocked
0.15 122.16 0.20 4001 0.05 0.08 tile_init
0.15 122.35 0.19 1111014 0.00 0.00 is_friendly_unit_tile
0.14 122.53 0.19 54254 0.00 0.01 reinforcements_cost
0.13 122.70 0.17 490296 0.00 0.00 map_get_special
0.13 122.87 0.17 446 0.38 15.89 find_something_to_kill
0.12 123.03 0.16 14752 0.01 0.01 is_friendly_city_tile
0.12 123.19 0.16 1094963 0.00 0.00 get_from_warstack
0.12 123.34 0.15 27558 0.01 0.02 could_be_my_zoc
0.12 123.49 0.15 10894 0.01 0.01 unit_list_find
0.12 123.64 0.15 161236 0.00 0.01 get_total_defense_power
0.11 123.78 0.14 4179 0.03 0.03 unit_list_init
0.11 123.92 0.14 90504 0.00 0.01 invasion_funct
0.10 124.05 0.13 615181 0.00 0.00
is_transporter_with_free_space
0.10 124.17 0.13 2011 0.06 0.07 can_unit_move_to_tile
0.10 124.30 0.13 54274 0.00 0.01 reinforcements_value
0.09 124.42 0.12 6477 0.02 0.02 is_field_unit
0.09 124.54 0.12 183 0.66 9.27 assess_danger
0.08 124.65 0.11 918313 0.00 0.00 is_sailing_unit
0.08 124.76 0.11 139851 0.00 0.00 strlen
0.08 124.86 0.11 123483 0.00 0.00 dir_ect
0.08 124.96 0.10 8287 0.01 0.01 is_heli_unit
0.08 125.06 0.10 6 16.67 16.67 _open
0.08 125.16 0.10 488379 0.00 0.00 .umul
0.08 125.26 0.10 97176 0.00 0.00 get_shields_tile
0.08 125.36 0.10 17112 0.01 0.13 light_square
0.07 125.46 0.10 435 0.22 0.22 init_gotomap
[snipped]
% the percentage of the total running time of the
time program used by this function.
cumulative a running sum of the number of seconds accounted
seconds for by this function and those listed above it.
self the number of seconds accounted for by this
seconds function alone. This is the major sort for this
listing.
calls the number of times this function was invoked, if
this function is profiled, else blank.
self the average number of milliseconds spent in this
ms/call function per call, if this function is profiled,
else blank.
total the average number of milliseconds spent in this
ms/call function and its descendents per call, if this
function is profiled, else blank.
name the name of the function. This is the minor sort
for this listing. The index shows the location of
the function in the gprof listing. If the index is
in parenthesis it shows where it would appear in
the gprof listing if it were to be printed.
|
|