Complete.Org: Mailing Lists: Archives: freeciv-dev: April 1999:
Re: [Freeciv-Dev] Some performance issues
Home

Re: [Freeciv-Dev] Some performance issues

[Top] [All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index] [Thread Index]
To: freeciv-dev@xxxxxxxxxxx
Subject: Re: [Freeciv-Dev] Some performance issues
From: Fabrice Noilhan <Fabrice.Noilhan@xxxxxx>
Date: Mon, 19 Apr 1999 11:03:44 +0200

On Sat, Apr 17, 1999 at 10:48:59PM -0500, Rizos Sakellariou wrote:
> Let me say here that multiplication is still an expensive operation on most 
> modern processors with a cost that varies from 10 to 14 CPU cycles in many 
> cases. Compilers are typically smart enough to convert multiplications of a 
> variable by an integer to a series of ADDs and SHIFTs (thereby reducing
> the cost of a MUL), but they are left with no choice in the case of
> the multiplication of two unknowns as it is our case here. As a solution,
> i suggest to replace all occurrences of y*map.xsize with something like 
> an array ymult[y] whose elements (from 0 to map.ysize) are precomputed
> at the beginning of each game. Depending on the machine used this may
> have a drastical effect on reducing the time spent here.

Since I'm working on this subject (cryptography and modular arithmeric),
I think I can have some useful comments:

- integer multiplication is an expensive operation on UltraSparc
processors (20 cycles for a 32*32 -> 64 bits operation). On Pentium
Pro/II/III, it only takes 4 cycles. It is really faster (in terms of
cycles) on old sparc processors (4-6 cycles).

- Sun's C compiler does NOT replace multiplications by a constant integer
by shifts and ADDS (even if it is stated in the documentation)

- Floating point operations are really faster on UltraSparc (26*26->52)
but the cost of conversion (using casts or assembly code) is really huge.
Anyway, you can do it properly if you know that you're handling integers
with appropriate substraction and masks.

- The cost of a lookup-table is similar to the cost of a multiplication
on most other processors. This table should be tiny if you want to stay
in the cache.

Now, it's up to you to decide if you want portability (in case you can
just leave these multiplications), efficiency (in case you can define
a macro for doing these multiplications) or use the not-most-efficient-but-
cool-on-all-processors solution (lookup table).

        Fabrice

[Prev in Thread] Current Thread [Next in Thread]