Re: [Gimp-developer] Re: Re: Re: GIMP and multiple processors

On 03.03.2005, at 20:09, GSR - FR wrote:

> With your idea, calculating the full 3000*3000 with a depth of 3 is
> like calculating 9000*9000 (81 million pixels in RGB, 243*10^6 bytes
> plus overhead), and in time it should be 9 times the 3000*3000
> non-adaptive version plus the scale operation. To avoid absurd memory
> usage, the code will have to be more complex than just rendering big
> and then scaling down. It could sample multiple planes and average
> them (9 stacked tiles, each with a small offset for the gradient
> sampling).

Huh? In my eyes the code would be *very* simple. Instead of allocating an image nine times the size of the sampled version, I'd rather allocate the sample tiles for the work area per thread. So, to stay with this example, rendering 1 tile needs the memory for the 1 permanent tile plus 9 temporary ones that are reused after supersampling.
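
A rough sketch of what I mean, in plain C. render_sample() is a made-up
stand-in for the per-pixel gradient code, TILE_SIZE and the 3*3 depth are
just example values, and the scratch buffer is the equivalent of the 9
temporary tiles:

#define TILE_SIZE 64   /* edge length of one destination tile */
#define DEPTH     3    /* 3x3 supersampling                    */

/* Made-up per-pixel renderer: fills one RGB sample at image
 * coordinates (x, y); stands in for the real gradient code.   */
extern void render_sample (double x, double y, unsigned char *rgb);

/* Render one destination tile.  'scratch' is a per-thread buffer of
 * (TILE_SIZE * DEPTH)^2 RGB samples, reused for every tile the thread
 * processes -- that is the "9 temporary tiles".                */
void
render_tile (int tx, int ty, unsigned char *dest, unsigned char *scratch)
{
  const int n = TILE_SIZE * DEPTH;
  int x, y, sx, sy, c;

  /* Pass 1: render the tile at DEPTH times the resolution into
   * the reusable scratch buffer.                                */
  for (y = 0; y < n; y++)
    for (x = 0; x < n; x++)
      render_sample (tx * TILE_SIZE + (double) x / DEPTH,
                     ty * TILE_SIZE + (double) y / DEPTH,
                     scratch + 3 * (y * n + x));

  /* Pass 2: average every DEPTH x DEPTH block of samples down
   * into the permanent destination tile.                        */
  for (y = 0; y < TILE_SIZE; y++)
    for (x = 0; x < TILE_SIZE; x++)
      for (c = 0; c < 3; c++)
        {
          int sum = 0;

          for (sy = 0; sy < DEPTH; sy++)
            for (sx = 0; sx < DEPTH; sx++)
              sum += scratch[3 * ((y * DEPTH + sy) * n + (x * DEPTH + sx)) + c];

          dest[3 * (y * TILE_SIZE + x) + c] = sum / (DEPTH * DEPTH);
        }
}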

> The current adaptive code is not parallel, but the algorithm, at the
> logic level, is parallelizable in tiles, or in groups of tiles so as
> not to waste so much at the edges.

I don't see how, but that would be a good start.
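
If it really is, the split would presumably look something like the
following. This is only a sketch using plain pthreads and the made-up
render_tile() from above, not GIMP's actual tile manager; get_tile_data()
is equally made up. A shared counter hands out tiles, so no two threads
ever touch the same destination tile:

#include <pthread.h>
#include <stdlib.h>

#define TILE_SIZE 64   /* same example values as above */
#define DEPTH     3
#define N_THREADS 2    /* e.g. one worker per CPU      */

/* Per-tile worker from the sketch above (made up).             */
void render_tile (int tx, int ty,
                  unsigned char *dest, unsigned char *scratch);
/* Made-up accessor for a destination tile's pixel data.        */
extern unsigned char *get_tile_data (int tx, int ty);

static int             n_tiles_x, n_tiles_y;  /* image size in tiles */
static int             next_tile;             /* shared work counter */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *
worker (void *unused)
{
  /* One reusable supersampling scratch buffer per thread. */
  unsigned char *scratch = malloc (TILE_SIZE * DEPTH * TILE_SIZE * DEPTH * 3);

  (void) unused;

  for (;;)
    {
      int t;

      /* Hand out the next unprocessed tile. */
      pthread_mutex_lock (&lock);
      t = next_tile++;
      pthread_mutex_unlock (&lock);

      if (t >= n_tiles_x * n_tiles_y)
        break;

      render_tile (t % n_tiles_x, t / n_tiles_x,
                   get_tile_data (t % n_tiles_x, t / n_tiles_x),
                   scratch);
    }

  free (scratch);
  return NULL;
}

void
render_gradient (int width, int height)
{
  pthread_t thread[N_THREADS];
  int       i;

  n_tiles_x = (width  + TILE_SIZE - 1) / TILE_SIZE;
  n_tiles_y = (height + TILE_SIZE - 1) / TILE_SIZE;
  next_tile = 0;

  for (i = 0; i < N_THREADS; i++)
    pthread_create (&thread[i], NULL, worker, NULL);
  for (i = 0; i < N_THREADS; i++)
    pthread_join (thread[i], NULL);
}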

> So I did some rough tests: 2000*2000 with adaptive vs 6000*6000
> without adaptive (9000 was too much for my computer, so I tried 2000
> and 6000, the same 1:3 ratio and still big). Small with adaptive was
> 10.3 sec and big without adaptive was 9.6 sec, for a linear black to
> white gradient from one corner to another or from side to side.

> Your idea does not seem to always be faster: it does not approach the
> magical 10x "order of magnitude" in many cases, maybe 3x in extreme
> ones, and it is a big memory hog if done naively. The only cases in
> which it is faster are when adaptive has to calculate all the samples
> anyway, since then the test overhead is a complete waste.

Apart from your machine obviously being completely different from mine, your comparison is neither fair nor even close to correct. Although memory bandwidth is plentiful and caches are big nowadays, an approach that uses an order of magnitude more memory in an inefficient way will lose hands down against a less efficient algorithm on a much smaller work area.
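
To put rough numbers on it (assuming GIMP's usual 64*64 tiles and 3*3 supersampling): the per-thread scratch buffer is 192*192*3 = about 110 KB, which stays cache-friendly even with several threads, whereas rendering the full 9000*9000 RGB image up front needs the 243*10^6 bytes you calculated before a single pixel is scaled down.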

> it avoids checks. When you want oversampling, the adaptive one is in
> many cases faster than full sampling, otherwise it would have been
> silly to design and code it in the first place.

Interesting conclusion.

> So please, apples to apples and oranges to oranges.

Yes, please.

If I weren't so short of time I would remove gobs of crufty, incomprehensible code and reuse the current code for a parallel full-supersampling approach, simply to prove you wrong. On top of this it should be pretty simple to re-add the adaptiveness for another large speed gain.

Servus,
      Daniel


