On Sun, 2005-02-27 at 01:04 +0100, Sven Neumann wrote:
> Hi again,
>
> Jay Cox <jaycox@xxxxxxxx> writes:
>
> > Clearly the gradient code could use some tuning. A linear blend
> > shouldn't take much more than 1/2 a second even with dithering.
>
> Could we improve this by preparing a reasonably fine array
> of samples and picking from these samples instead of calling
> gimp_gradient_get_color_at() over and over again?

That is one obvious optimization.  Other obvious optimizations:

The dither code is way too complex.  It looks like it should boil down
to: color.{r,g,b,a} += g_rand_int()/RAND_MAX.

We shouldn't need 32 bits of random data per component.  8 bits should
do, so we only need one call to g_rand_int() per pixel.

For linear blends we should calculate delta values for the factor
variable.  This will allow us to calculate the factor for each pixel
with only one add (plus edge-condition checks).

> > The reason why the dithering case gets less of a speedup is because
> > the threads are fighting over the GRand state.  Each thread needs
> > to have its own GRand state.
> >
> > It looks like the threads are also fighting over
> > gradient->last_visited.  My guess is that fixing this will get us
> > much closer to the ideal 2x speedup.
>
> I have eliminated last_visited from the gradient struct.  Instead the
> caller of gimp_gradient_get_color_at() may now do the same
> optimization without any caching in the gradient itself.  I very much
> doubt that this makes any difference, though.  Perhaps it would if
> you benchmarked a gradient blend with a lot of segments, but in the
> general case there are just a few of them, very often even only a
> single one.
>
> Now that this race condition is eliminated I might look into adding
> hooks to the pixel-processor to allow initialisation of per-thread
> data, like for example a GRand.

I think that is the correct way to do it.
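For what it's worth, the sample-table idea Sven asks about could look
roughly like this (the gradient lookup below is a trivial stand-in for
gimp_gradient_get_color_at(), and N_SAMPLES is an arbitrary size, not
anything from the real GIMP code):

```c
#define N_SAMPLES 1024

typedef struct { double r, g, b, a; } Color;

/* Stand-in for gimp_gradient_get_color_at(): an expensive
   per-position gradient lookup (here just a grayscale ramp). */
static void
gradient_get_color_at (double pos, Color *color)
{
  color->r = pos;  color->g = pos;  color->b = pos;  color->a = 1.0;
}

/* Pre-sample the gradient once into a table ... */
static void
gradient_fill_samples (Color samples[N_SAMPLES])
{
  int i;
  for (i = 0; i < N_SAMPLES; i++)
    gradient_get_color_at ((double) i / (N_SAMPLES - 1), &samples[i]);
}

/* ... then each pixel is a table index instead of a full lookup. */
static const Color *
sample_lookup (const Color samples[N_SAMPLES], double pos)
{
  int i = (int) (pos * (N_SAMPLES - 1) + 0.5);

  if (i < 0)             i = 0;
  if (i > N_SAMPLES - 1) i = N_SAMPLES - 1;
  return &samples[i];
}
```

Whether 1024 samples is fine enough to avoid visible banding would
need testing; the dither would hide most of it anyway.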
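The one-random-call-per-pixel dither I have in mind would be something
like the following sketch (rand_int32() is just a portable stand-in
for GLib's g_rand_int() so the snippet is self-contained):

```c
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for g_rand_int(): 32 random bits per call. */
static uint32_t
rand_int32 (void)
{
  return ((uint32_t) (rand () & 0xffff) << 16) | (rand () & 0xffff);
}

/* Dither one RGBA pixel: a single 32-bit draw supplies 8 bits of
   noise per channel, added before truncating to 8-bit output, so
   fractional values round up or down probabilistically. */
static void
dither_pixel (const double color[4], unsigned char out[4])
{
  uint32_t bits = rand_int32 ();
  int      b;

  for (b = 0; b < 4; b++)
    {
      double noise = (double) ((bits >> (8 * b)) & 0xff) / 255.0;
      int    v     = (int) (color[b] * 255.0 + noise);

      out[b] = v > 255 ? 255 : v;
    }
}
```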
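The delta-value idea for linear blends is just incremental evaluation:
since the factor is linear along a row, the inner loop only needs one
add per pixel.  A sketch (names are illustrative, not the blend tool's
actual variables):

```c
/* Fill one row of blend factors from f0 at x=0 to f1 at x=width-1
   with a single add per pixel instead of a projection per pixel. */
static void
fill_row_factors (double *factors, int width, double f0, double f1)
{
  double f     = f0;
  double delta = (width > 1) ? (f1 - f0) / (width - 1) : 0.0;
  int    x;

  for (x = 0; x < width; x++)
    {
      /* the clamp is the "edge condition check": it absorbs
         accumulated floating-point error at the row ends */
      factors[x] = f < 0.0 ? 0.0 : (f > 1.0 ? 1.0 : f);
      f += delta;
    }
}
```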
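The per-thread-data hook could be as simple as an init callback that
the processor invokes once per worker before it starts handing out
tiles.  A rough sketch using plain pthreads (all the names here are
made up for illustration; this is not the pixel-processor API, and the
trivial seeded struct merely stands in for a GRand):

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct { unsigned int seed; } ThreadRand;  /* stands in for GRand */

typedef void *(*ThreadInitFunc) (int thread_id);

typedef struct
{
  ThreadInitFunc  init;   /* called once per worker thread */
  void           *data;   /* per-thread state, never shared  */
  int             id;
} Worker;

/* Give each worker its own differently-seeded random state. */
static void *
rand_init (int thread_id)
{
  ThreadRand *r = malloc (sizeof (ThreadRand));

  r->seed = 0x9e3779b9u * (unsigned int) (thread_id + 1);
  return r;
}

static void *
worker_run (void *arg)
{
  Worker *w = arg;

  w->data = w->init (w->id);   /* no contention on shared state */
  /* ... region processing would use w->data here ... */
  return NULL;
}
```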
It should be done generally enough that the histogram code can be
moved over to use the pixel_region_process_parallel functions.

Jay Cox
jaycox@xxxxxxxx