Hello,

Thanks for your answer.

> As far as I am concerned, CUDA and ATI's Stream SDK aren't an option
> because they are video card specific. I haven't really looked into
> them but I'm sure they are easy to use for computations, but the same
> computations can be accomplished through OpenGL through multiple
> rendering/computation passes. Moreover, we decided to go with OpenGL
> instead of a cross-platform API like OpenCL because implementations
> for the latter are still either very young or non-existent for most
> major platforms[1].

Yes, I hope that all the needed operations can be done via OpenGL too, because I have already experienced that CUDA sometimes causes problems even on supported platforms if you don't use exactly the same compiler options as the samples, etc. On the other hand, OpenGL is an additional, unnecessary layer, because the OpenGL computations use the same hardware you can access directly via CUDA or equivalent SDKs (if I have interpreted the information I read about CUDA correctly). This layer is optimized for rendering, not for computing, so - from a purely simplistic view - CUDA would be easier to use and faster than OpenGL.

I had not heard of OpenCL before, but it seems to be very interesting (and not proprietary). I hope that it will be available for the most important platforms soon.

> Sorry, but I'm not really familiar with CUDA's threading model.

Basically it's just the same as normal CPU threads, which IMO would be important to support too. Many (especially Linux) systems don't have graphics acceleration available, but do have more than one CPU. Today you can already get 8 cores, i.e. filters would be roughly 8 times faster if they utilised all CPUs (given at least 8 available rectangles).

> Furthermore, GEGL doesn't really take care of thread creation and
> management.

I have seen this. But isn't that bad? My opinion is that it should.

> As far as I can see, your code's intent is to divide the
> tasks into smaller packets of rectangles. GEGL already has GEGL
> processors that can be used to accomplish this[2]. In fact,
> internally, GEGL uses tiles to divide the image into smaller packets
> for on demand rendering and caching.

Yes, I have found the relevant code sections in the GEGL source (at least I think I have ;) ), and they seem to have the linear structure I mentioned in my first email. I mean that gegl_processor_work() returns a boolean telling you whether there is more work to do, in which case you have to call it again (the first sketch at the end of this mail shows that loop). I think this is not usable for parallelism, because you always have to wait until the current packet (processed by gegl_processor_work()) is finished before you know whether you have to call it again. For parallelism, a better approach would be for gegl_processor_work() to do everything at once, for instance by moving the outer while loop over the rectangles into an inner for loop (roughly like the second sketch at the end of this mail).

I just wonder whether that would help for the pure OpenGL approach too. But I think that processing with OpenGL _and_ multithreading would get the most out of pure OpenGL acceleration as well.

I will have a look at the articles on gpgpu.org too; it seems to be a very interesting site, thanks.

--
Richard H. (via www.gimpusers.com)

_______________________________________________
Gegl-developer mailing list
Gegl-developer@xxxxxxxxxxxxxxxxxxxxxx
https://lists.XCF.Berkeley.EDU/mailman/listinfo/gegl-developer
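First sketch: the serial structure I mean, as a minimal example based on my reading of the GEGL processor API (gegl_node_new_processor() and gegl_processor_work()); error handling and progress reporting are omitted:

#include <gegl.h>

/* Process a node's region of interest serially: each call to
 * gegl_processor_work() handles one packet of work and returns
 * TRUE while more work remains. */
static void
process_all (GeglNode *node, const GeglRectangle *roi)
{
  GeglProcessor *processor = gegl_node_new_processor (node, roi);
  gdouble        progress  = 0.0;

  /* The next packet cannot start until the current call has
   * returned - exactly the linear structure described above. */
  while (gegl_processor_work (processor, &progress))
    ;

  g_object_unref (processor);
}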
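Second sketch: the kind of multithreading I have in mind, assuming a GLib thread pool working on horizontal strips of the image. Note that process_rect() is a made-up placeholder for the per-rectangle filter work, not an existing GEGL function:

#include <glib.h>

typedef struct {
  gint x, y, width, height;
} Rect;

/* Placeholder: run the filter on one rectangle. */
static void
process_rect (gpointer data, gpointer user_data)
{
  Rect *rect = data;
  /* ... per-rectangle processing would go here ... */
  g_free (rect);
}

static void
process_parallel (gint width, gint height, gint n_threads)
{
  GThreadPool *pool;
  gint strip = MAX (height / n_threads, 1);
  gint y;

  pool = g_thread_pool_new (process_rect, NULL, n_threads, FALSE, NULL);

  /* Queue one horizontal strip per task; with 8 cores and at
   * least 8 strips this is where the ~8x speedup would come from. */
  for (y = 0; y < height; y += strip)
    {
      Rect *rect = g_new0 (Rect, 1);
      rect->x      = 0;
      rect->y      = y;
      rect->width  = width;
      rect->height = MIN (strip, height - y);
      g_thread_pool_push (pool, rect, NULL);
    }

  /* Wait for all queued rectangles to finish before returning. */
  g_thread_pool_free (pool, FALSE, TRUE);
}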