Sven Neumann <sven@xxxxxxxx> wrote:
>  __u32 __rb = (((color.r)<<16) | (color.b));
>  __u32 __g  = ((color.g)<<8);
>
>  switch (a) {\
>    case 0xff: *(d) = (0xff000000 | __rb | __g); \
>    case 0: break; \
>    default: {\
>      __u32 pixel = *(d);\
>      __u16 s = (a)+1;\
>      register __u32 t1,t2; \
>      t1 = (pixel&0x00ff00ff); t2 = (pixel&0x0000ff00); \
>      pixel = ((((__rb-t1)*s+(t1<<8)) & 0xff00ff00) + \
>               ((( __g-t2)*s+(t2<<8)) & 0x00ff0000)) >> 8; \
>      *(d) = pixel;\
>    }\
>  }
>
>if you think this looks ugly, you should have a look at the same
>function for RGB16 and RGB15 ;-)

I don't think they are that bad: the readability of the above code
merely suffers from a pollution of backslashes and underscores.

But the general principle is useful, and it's not hard to do parallel
saturating additions and subtractions without any branches at all,
just using bit fiddling. Many modern architectures can do better with
vector instructions, but generic fallback code is of course always
needed.
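
To make the bit-fiddling remark concrete, here is a rough sketch of
branch-free saturating add and subtract on four 8-bit channels packed
into one 32-bit word. It is not taken from the code quoted above: the
function names sat_add_4x8 and sat_sub_4x8 and the four-bytes-per-word
layout are my own assumptions for illustration, and it uses the
<stdint.h> types rather than __u32.

```c
#include <stdint.h>

#define LANE_HI 0x80808080u  /* top bit of each 8-bit lane */
#define LANE_LO 0x7f7f7f7fu  /* low seven bits of each lane */

/* Hypothetical helper: per-byte a + b, clamped to 0xff. */
static uint32_t sat_add_4x8(uint32_t a, uint32_t b)
{
    /* Add the low 7 bits of every lane; the carry lands in bit 7
       of the same lane and cannot spill into the neighbour. */
    uint32_t low = (a & LANE_LO) + (b & LANE_LO);
    /* Fold in the top bits to get the wrapped per-lane sum. */
    uint32_t sum = low ^ ((a ^ b) & LANE_HI);
    /* Carry out of bit 7: majority of (a7, b7, carry into bit 7). */
    uint32_t ovf = ((a & b) | ((a | b) & low)) & LANE_HI;
    /* Expand each overflow flag to 0xff across its lane. */
    uint32_t mask = (ovf >> 7) * 0xffu;
    return sum | mask;
}

/* Hypothetical helper: per-byte a - b, clamped to 0x00. */
static uint32_t sat_sub_4x8(uint32_t a, uint32_t b)
{
    /* Force bit 7 of every lane of a so that subtracting the low
       7 bits of b never borrows from the neighbouring lane. Bit 7
       of the result is then 1 where no borrow was needed. */
    uint32_t low = (a | LANE_HI) - (b & LANE_LO);
    /* Fix up the top bits to get the wrapped per-lane difference. */
    uint32_t diff = low ^ ((a ^ ~b) & LANE_HI);
    /* Borrow out of bit 7, i.e. lanes where a < b. */
    uint32_t under = ((~a & b) | (~(a ^ b) & ~low)) & LANE_HI;
    /* Expand each underflow flag to 0xff and clear those lanes. */
    uint32_t mask = (under >> 7) * 0xffu;
    return diff & ~mask;
}
```

For example, sat_add_4x8(0x10F08001, 0x20208002) gives 0x30FFFF03 and
sat_sub_4x8(0x10F08001, 0x20208002) gives 0x00D00000. The same idea
carries over to the narrower RGB16/RGB15 fields with different masks.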