Re: [Gimp-developer] Re: your so called optimizations and why we don't like them

degger@xxxxxxx · Tue, 4 Dec 2001 14:17:06 +0100 (CET)

On  3 Dec, Robert L Krawitz wrote:

> By how much?

Depends on the code and the compiler. And the range I'm talking about
is usually between 0 and 50% improvement in both code and size.

> If it can't be measured, it's probably not enough to be
> worthwhile.

Aside from the gains it's IMHO also cleaner. The gcc and linux kernel
people handle it the same way btw.

> And if you use explicit error returns (which may require
> an additional pointer to be passed in), you'll quickly eat up any
> performance gain you might achieve.

No, the tile-manager change I did shows this pretty clear. Although 
using another parameter the object size is smaller here. If I get around
to do so I'll check the difference for a few more architectures here.

> Using signed integers also catches a particularly common error case
> (subtracting a value from a smaller value) that otherwise has to be
> checked for explicitly, which is also going to require more
> instructions (and even worse on modern processors, branches).

I doubt it. If negative numbers are likely to happen then the code
has to check for it anyway and thus the unsignedness of the single
parts of the calculation don't make any difference. And further modern
processors have conditional instructions which can be exactly used
for such kind of expressions and the more simple the check the easier
it is for the compiler to apply them. It is easier to check any type
agains 0 for example than against negativeness and this part also plays
a role when returning 0 or non-null instead of a negative value.

> Anyway, integer arithmetic (additions and subtractions) is usually one
> cycle (plus however long it takes to retrieve the operands and
> instruction from memory, cache, whatever) on modern processors, so
> it's hard for me to see how unsigned arithmetic is going to help.

That's correct, the CPU doesn't care about signedness. But by telling
the compiler that a value is unsigned it can apply additional
optimisation and it also simplifies some calculations considerably.
For example if you shift a signed value it has to generate code to
take care of the sign and similar with some logic operations.

> Fine, but
> 
> a = c - d;
> if (a < 0)
>   abort()
> 
> is likely to be more efficient than
> 
> if (c < d)
>   abort()
> else
>   a = c - d

As this stands it is absolutely correct. However this is only a part of
the picture. The question is: What will happen in the rest of the code?
Also no one would prevent you from making "a" signed if you suspect a
to be negative. There's a lot of code where it is given that some value
is positive for example because it is used in loops which start from 0
all over the place. 

I'm not argueing about signed values in general but about variables
which are by purpose defined to be unsigned.

> And yes, maybe there's a bug in the code, but isn't it better to have
> a reasonably easy way of catching it?  If you put assertions in the
> code, they can be compiled in during debugging, and then turned off
> (thereby not consuming any extra time) when the code is solid.

One may or may not like the way Linus Torvalds codes but his programming
has definitely quite some style and elegance and one of his points is
only to have checks where you expect errors to happen instead of all
over the place. Putting assertions in there to catch problems in the
development phase is definitely a good idea and once the code is
stable they'll ressolve into nops and all that's left is efficient code;
that's how it should work.

> Get some numbers first before worrying about micro-optimizations like
> this.

I would if I could.

> Code that looks simpler isn't necessarily faster, and in

Trust me, if I see the assembly I can tell you which one is faster and
by which magnitude. As you already said most arithmetic calculations 
are one-cycler on modern CPUs and seldomly have sideeffects while
memory accesses in general are slow; if the assembly shows fewer memory
accesses AND fewer calculations while the branches are unaffected there
would have to been serious pipeline issues (because of dependencies) in
the code to make it slower.

> particularly it may not be *significantly* faster.  You're better off
> looking for hot spots in the code; if this isn't a hot spot, it
> doesn't matter what you do.

Oh, paint-funcs are a hotspot, it's quite easy to drag a fast machine
down by painting with a big brush in a transparent layer for instance;
most of the time is then spent in the combine_* functions which is
(as it is now) slow and I'm looking to speed that up by changing the
algorithm; still the basic calculations will remain the same so speeding
them up while brushing up the code is IMHO not a bad idea. Anyway, we're
talking about stuff here that's partially done.

> I've done some stuff like this in Gimp-print (although I haven't found
> unsigned vs. signed to matter very much, unless I can replace a
> division by a power of two by a right shift, in which it's sometimes
> worthwhile to take the branch penalty), but I've spent a lot of time
> with the profiler to actually find the real hot spots.  Simply moving
> constant computations out of loops (into a lookup table, or some such)

Careful, lookup tables can easily trash the caches when done incorrectly
and thus affect the speed of other locations.

> often saves a lot more time than any of this bit fiddling.

The bit fiddling is there to reduce the size of structures which means
that there's less memory to be used for storing them and to access
several bits only one load or store is necessary which can speed up the
code drastically if the structure is an often used one like a tile.

> I agree, but that's exactly why this kind of micro-optimization is
> premature.  You'll get much better results looking for the hot spots
> first, and understanding exactly what's going on and how you can
> devise an overall more efficient approach, than by blindly converting
> signed ints to unsigned and using bit fields.

Actually I wasn't doing it blindly but I was looking for some ways
to reorganise the code to speed up the general flow.

--
Servus,
       Daniel