Re: [Freeciv-Dev] Some performance issues
[Top] [All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index] [Thread Index]
>
> On Sat, Apr 17, 1999 at 10:48:59PM -0500, Rizos Sakellariou wrote:
> > Let me say here that multiplication is still an expensive operation on most
> > modern processors with a cost that varies from 10 to 14 CPU cycles in many
> > cases. Compilers are typically smart enough to convert multiplications of a
> > variable by an integer to a series of ADDs and SHIFTs (thereby reducing
> > the cost of a MUL), but they are left with no choice in the case of
> > the multiplication of two unknowns as it is our case here. As a solution,
> > i suggest to replace all occurrences of y*map.xsize with something like
> > an array ymult[y] whose elements (from 0 to map.ysize) are precomputed
> > at the beginning of each game. Depending on the machine used this may
> > have a drastical effect on reducing the time spent here.
>
> Since I'm working on this subject (cryptography and modular arithmeric),
> I think I can have some useful comments:
>
> - integer multiplication is an expensive operation on UltraSparc
> processors (20 cycles for a 32*32 -> 64 bits operation). On Pentium
> Pro/II/III, it only takes 4 cycles. It is really faster (in terms of
> cycles) on old sparc processors (4-6 cycles).
>
> - Sun's C compiler does NOT replace multiplications by a constant integer
> by shifts and ADDS (even if it is stated in the documentation)
>
> - Floating point operations are really faster on UltraSparc (26*26->52)
> but the cost of conversion (using casts or assembly code) is really huge.
> Anyway, you can do it properly if you know that you're handling integers
> with appropriate substraction and masks.
>
> - The cost of a lookup-table is similar to the cost of a multiplication
> on most other processors. This table should be tiny if you want to stay
> in the cache.
>
> Now, it's up to you to decide if you want portability (in case you can
> just leave these multiplications), efficiency (in case you can define
> a macro for doing these multiplications) or use the not-most-efficient-but-
> cool-on-all-processors solution (lookup table).
Ok - let's get some things straight:
- Portability is defined as whether your code can be easily compiled
and run on different architectures. Changing y * map.xsize with
something like yoffset[y] for instance is not a machine dependent
operation and has nothing to do with portability.
- Efficiency is associated with the performance of the code which
results from the code that the compiler generates. In the particular
case, there is scope of replacing a generally expensive multiplication
with a not so expensive load.
Finally: Performance tuning, especially for a variety of platforms,
is a tedious process; if we also take into account the complexity
of modern processors all this implies that one has to be investigative
rather than authoritative.
--rizos
|
|