GPU support

Hello,

Thanks for your answer.

>As far as I am concerned, CUDA and ATI's Stream SDK aren't an option
>because they are video card specific.  I haven't really looked into
>them but I'm sure they are easy to use for computations, but the same
>computations can be accomplished through OpenGL through multiple
>rendering/computation passes. Moreover, we decided to go with OpenGL
>instead of a cross-platform API like OpenCL because implementations
>for the latter are still either very young or non-existent for most
>major platforms[1].

Yes, I hope that all the needed operations can be done via OpenGL too, because I
have already seen CUDA cause problems even on supported platforms when you
don't use exactly the same compiler options as the samples, etc. On the other
hand, OpenGL is an additional, unnecessary layer: its computations run on the
same hardware you can access directly via CUDA or equivalent SDKs (if I have
interpreted what I read about CUDA correctly). That layer is optimized for
rendering, not for computing, so - from a purely simplistic view - CUDA should
be both easier to use and faster than OpenGL.

I had never heard of OpenCL, but it looks very interesting (and not
proprietary). I hope it will be available on the major platforms soon.

>Sorry, but I'm not really familiar with CUDA's threading model.
Basically it is the same as normal CPU threads, which IMO would also be
important to support. Many systems (especially Linux ones) have no graphics
acceleration available, but do have more than one CPU. Today you can already
get 8 cores, i.e. filters would run roughly 8 times faster if they utilised all
of them (given at least 8 available rectangles).
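To illustrate the idea, here is a minimal sketch of splitting a filter's output region into rectangles and handing them to a pool of worker threads. It is in Python for brevity (GEGL itself is C, where it would use GThreadPool or pthreads), and every name in it is made up for the example, not GEGL API:

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_rectangles(width, height, strip_height):
    """Divide a width x height region into horizontal strips."""
    return [(0, y, width, min(strip_height, height - y))
            for y in range(0, height, strip_height)]

def apply_filter(rect):
    # stand-in for a real filter pass; just reports the strip's pixel count
    x, y, w, h = rect
    return w * h

def process_parallel(width, height, strip_height, workers=8):
    """Run the (fake) filter over all strips on a pool of worker threads."""
    rects = split_into_rectangles(width, height, strip_height)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(apply_filter, rects))
```

(In C each worker would genuinely run on its own core; the sketch only shows the dispatch structure, not the actual speedup.)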

>Furthermore, GEGL doesn't really take care of thread creation and
>management.

I have seen this. But isn't that bad? In my opinion, it should.

> As far as I can see, your code's intent is to divide the
>tasks into smaller packets of rectangle.  GEGL already has GEGL
>processors that can be used to accomplish this[2]. In fact,
>internally, GEGL uses tiles to divide the image into smaller packets
>for on demand rendering and caching.

Yes, I have found the relevant code sections in the GEGL source (at least I
think I have ;) ), and they seem to have the linear structure I mentioned in my
first email. I mean that gegl_processor_work() returns a boolean telling you
whether there is more work to do, in which case you have to call it again. I
think this is not usable for parallelism, because you always have to wait until
the current packet (processed by gegl_processor_work()) is finished before you
know whether you have to call it again. For parallelism, a better approach
would be for gegl_processor_work() to do everything at once, for instance by
moving the outer while loop over the rectangles into an inner for loop.
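Schematically, the difference between the two control flows could look like this. This is a Python sketch with made-up names; gegl_processor_work() is the only real GEGL function it alludes to, and only in the comments:

```python
from concurrent.futures import ThreadPoolExecutor

def apply_filter(rect):
    # stand-in for rendering one rectangle; returns its pixel count
    x, y, w, h = rect
    return w * h

def processor_work_serial(pending):
    """Mimics the current gegl_processor_work() contract: process ONE
    packet, then report whether more packets remain.  The caller cannot
    start the next packet until this call has returned."""
    if not pending:
        return False
    apply_filter(pending.pop(0))
    return len(pending) > 0

def processor_work_all(pending, workers=8):
    """Hypothetical alternative: the outer while loop over the rectangles
    is pulled inside, so all pending packets can be dispatched to a
    thread pool in a single call."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(apply_filter, pending))
    pending.clear()
    return results
```

With the serial contract the caller is forced into `while processor_work_serial(queue): pass`, so packets can never overlap; the second variant knows the whole work list up front and can overlap them freely.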

I just wonder whether that would also help the pure OpenGL approach. But I
think that processing with OpenGL _and_ multithreading would get the most out
of pure OpenGL acceleration as well.

I will have a look at the articles on gpgpu.org, too; it seems to be a very
interesting site, thanks.

-- 
Richard H. (via www.gimpusers.com)
_______________________________________________
Gegl-developer mailing list
Gegl-developer@xxxxxxxxxxxxxxxxxxxxxx
https://lists.XCF.Berkeley.EDU/mailman/listinfo/gegl-developer
