On Sun, Apr 17, 2011 at 9:40 PM, <jcupitt@xxxxxxxxx> wrote:
> On 17 April 2011 14:24, Øyvind Kolås <pippin@xxxxxxxx> wrote:
>> On my c2d 1.86ghz laptop I get 105s real 41s user with default settings.
>> Setting GEGL_SWAP=RAM in the environment to turn off the disk swapping
>> of tiles makes it run in 43s real 41s user.
>
> I found GEGL_SWAP=RAM, but on my laptop the process wandered off into
> swap death before finishing. Is there some way to limit mem use? I
> only have 2gb.

My laptop has 3gb of RAM and thus doesn't end up crunching swap on such
a test. Setting GEGL_CACHE_SIZE=1300 or so should have a similar effect;
hopefully GEGL wouldn't need to push everything to swap. (Note that when
doing so you should _not_ set GEGL_SWAP=RAM.)

I noticed that setting GEGL_THREADS=anything_more_than_1 causes crashes,
along with other things that break more subtly; that is why GEGL doesn't
default to keeping all cores busy yet.

>> Loading a png into a tiled buffer as used by GeglBuffer is kind of
>> bound to be slow, at the moment GEGL doesnt have a native TIFF loader,
>
> You can work with tiled tiff straight from the file, but sadly for
> striped tiff (as 90%+ are, groan) you have to unpack the whole file
> first :-(

I'm not sure what a striped tiff is; if it stores each scanline
separately, GeglBuffer could load data directly from it by using
1px-high tiles that are as wide as the image.

>>> babl converts to linear float and back with exp() and log(). Using
>>> lookup tables instead saves 12s.
>>
>> If the original PNG was 8bit, babl should have a valid fast path for
>> using lookup tables converting it to 32bit linear. For most other
>
> OK, interesting, I shall look at the callgrind output again.

I'd recommend setting the BABL_TOLERANCE=0.004 environment variable as
well, to permit some fast paths with errors around or below 1.0/256,
avoiding the rather computationally intensive synthetic reference
conversion code in babl.

>>> The gegl unsharp operator is implemented as gblur/sub/mul/add. These
>>> are all linear operations, so you can fold the maths into a single
>>> convolution. Redoing unsharp as a separable convolution saves 1s.
>>
>> For smaller radiuses this is fine, for larger ones it is not, ideally
>> GEGL would be doing what is optimal behind the users back.
>
> Actually, it works for large radius as well. By separable convolution
> I mean doing a 1xn pass then a nx1 pass. You can "bake" the
> sub/mul/add into the coefficients you calculate in gblur.

I thought you meant hard-coded convolutions similar to the
crop-and-sharpen example. Baking it into the convolution might be
beneficial, though at the moment I see it as more important to make sure
gaussian blur is as fast as possible, since it is a primitive that both
this and dropshadow and other commonly used compositing operations are
built from.

/Øyvind K.

-- 
«The future is already here. It's just not very evenly distributed»
                                                 -- William Gibson
http://pippin.gimp.org/ ; http://ffii.org/
_______________________________________________
Gegl-developer mailing list
Gegl-developer@xxxxxxxxxxxxxxxxxxxxxx
https://lists.XCF.Berkeley.EDU/mailman/listinfo/gegl-developer
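
As a hedged illustration of the cache and conversion tuning discussed
above, here is a minimal sketch that sets the same variables
programmatically with g_setenv() before gegl_init(); this should be
equivalent to exporting them in the shell, assuming GEGL and babl read
them from the environment at init time. The main() scaffolding is
illustrative only, and GEGL_CACHE_SIZE is assumed to be in megabytes.

#include <glib.h>
#include <gegl.h>

int
main (int argc, char **argv)
{
  /* Cap the tile cache (value assumed to be in megabytes) so the
   * process stays well under 2 GB of RAM; do not combine a small
   * cache with GEGL_SWAP=RAM. */
  g_setenv ("GEGL_CACHE_SIZE", "1300", TRUE);

  /* Permit babl fast paths with errors around or below 1.0/256. */
  g_setenv ("BABL_TOLERANCE", "0.004", TRUE);

  gegl_init (&argc, &argv);

  /* ... build and process the GEGL graph as usual ... */

  gegl_exit ();
  return 0;
}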
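
On folding the unsharp sub/mul/add into the convolution: since
unsharp(x) = x + amount * (x - gauss(x)) = (1 + amount) * x - amount *
gauss(x), the whole operation collapses to one kernel
k[i] = (1 + amount) * delta[i] - amount * g[i]. Below is a minimal 1-D
sketch of that folding; the function names and plain float buffers are
illustrative, not GEGL API, and a real separable 2-D version needs more
care because after folding the kernel is no longer a plain Gaussian in
both passes.

#include <math.h>
#include <stdlib.h>

/* Build the folded 1-D kernel of width 2*radius+1:
 * k = (1 + amount) * delta  -  amount * normalized_gaussian. */
static float *
build_unsharp_kernel (int radius, float sigma, float amount)
{
  int    width = 2 * radius + 1;
  float *k     = malloc (width * sizeof (float));
  float  sum   = 0.0f;

  for (int i = 0; i < width; i++)
    {
      float d = (float) (i - radius);
      k[i] = expf (-(d * d) / (2.0f * sigma * sigma));
      sum += k[i];
    }

  for (int i = 0; i < width; i++)
    k[i] = -amount * (k[i] / sum);   /* -amount * gaussian term        */
  k[radius] += 1.0f + amount;        /* +(1 + amount) * identity term  */

  return k;
}

/* Apply the folded kernel to one row of float pixels, clamping at the
 * edges; a single convolution with k is the whole unsharp operation. */
static void
unsharp_row (const float *src, float *dst, int n,
             const float *k, int radius)
{
  for (int x = 0; x < n; x++)
    {
      float acc = 0.0f;
      for (int i = -radius; i <= radius; i++)
        {
          int sx = x + i;
          if (sx < 0)  sx = 0;
          if (sx >= n) sx = n - 1;
          acc += k[i + radius] * src[sx];
        }
      dst[x] = acc;
    }
}

Convolving once with k gives the same result as running gblur, sub, mul
and add as separate passes, which is the saving being discussed.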