On Mon, 11 Aug 2003, Guillermo S. Romero / Familia Romero wrote:

> rock@xxxxxxxx (2003-08-08 at 1801.54 -0700):
> > Portable XCF would use a chunk system similar to PNG, with two major
> > differences.  First, chunk type would be a string instead of a 32-bit
> > value.  Second, chunks can contain an arbitrary number of subchunks, which
> > of course can contain subchunks themselves.
>
> PNG 32 bit names are char... or at least all of them can be read. :] And
> I think the purpose of this was, among other ideas: easy to parse
> (always four chars) and makes sense with some rules about chars (caps
> vs normal). Even the magic of PNG had a reasoning (part binary to
> avoid confusion with text and capable of detecting non 8 bit
> transmission or bad byte order). IOW, why not make it similar, but just
> bigger (four chars for name space and 12 more for function)? Arbitrary
> size strings do not seem a good idea to me.

This seems like a good proposal.

> Another thing, alignment (and thus padding), is it worth the problems it
> could cause? If the format has to be fast, maybe this should be taken
> into account, and not only about small sizes in memory (ie 32 bit),
> but maybe disks (ie blocks) or bigger sizes in memory (ie pages) too.
> Would the format be used just as storage, or would it be used as
> source / destination when memory is scarce. Remember that some apps
> are capable of working in areas instead of the full image, to improve
> global throughput.

Right.  To be mmappable, the format should be aligned.  I think with
careful design, there won't be too much overhead from this.  When I wrote
that the example was just a rough sketch, part of what I meant was that I
didn't pay too much attention to bit sizes and alignment, because that
would have been premature optimization.

One issue with alignment is which platform's alignment rules should be
used.  I think a good common-denominator format can be found.  It won't
get the weird ones, of course.
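As a rough way to put a number on that overhead (an illustration only, not
a proposal for the actual layout), the sketch below rounds every chunk
start up to an assumed 8-byte boundary and counts the filler bytes this
adds.  The 8-byte figure and the names align_up and padding_overhead are
made up for the example.

```python
ALIGNMENT = 8  # assumption: a common-denominator boundary, not a decided value


def align_up(offset, alignment=ALIGNMENT):
    """Round offset up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def padding_overhead(chunk_sizes, alignment=ALIGNMENT):
    """Return (total_bytes, padding_bytes) when every chunk starts aligned."""
    offset = 0
    padding = 0
    for size in chunk_sizes:
        start = align_up(offset, alignment)
        padding += start - offset  # filler inserted before this chunk
        offset = start + size
    return offset, padding


if __name__ == "__main__":
    # Hypothetical chunk sizes: a few small metadata chunks and two tiles.
    total, pad = padding_overhead([13, 4096, 7, 65536, 21])
    print(f"{pad} padding bytes in a {total}-byte file")
```

With mostly large image chunks the filler stays tiny relative to the file;
it only starts to matter if the format ends up with many very small chunks.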
I work on a Cray, and nothing follows Cray's alignment rules. :)

> > image data chunks should use png-style adaptive predictive compression.
> > They should also use adam-7.
>
> I would avoid compression inside the format. Files can be compressed
> as a whole

It does complicate in-place image manipulation, true.  OTOH, you can get
much better lossless compression using image-specific techniques such as
predictive compression than you can using general purpose techniques.

> and IIRC Adam7 is about interlacing, not compression,
> dunno why an editor should do progressive load. Load smaller res in
> case of problem? I would try to avoid that instead of try to fix it,
> with proper storage and transmission. Load with proxy images? Too
> rough, IMO, it is not a scaled down version.

Well, working on a scaled-down version of large files is an important
optimization.  It's true that not all image manipulation functions can
credibly be approximated by working on a scaled-down version, but that's
for the GEGL people to worry about.  My guess is that it will be easier to
use interlaced data than true scaled-down images, and the savings in terms
of computational time and pipeline flexibility will be worth it.

> PNG compression is the one provided by zlib

PNG does run zlib over the image data, but the entropy is first
significantly reduced by predictive encoding (each scanline is filtered
before it is compressed).  It's not the same as just running gzip on raw
data.

> and I can show you cases in which other compressors have done a better
> job with my XCF files (anybody can try bzip2), and if computers keep
> evolving the same way, the extra CPU load is better than the disk or
> network transfer.

True.

> Letting other apps do it means those apps could be general, reducing
> work load.

Of course, but we should not sacrifice functionality for convenience.
:)

> Or better, custom, but once the "look" of the data is well
> known and there is plenty of test cases (like FLAC but for XCF2,
> compression targeted at some kind of patterns).

Conformance testing is very important.  That is a good idea.

> Realize too that this links to alignment things, if you know that a layer
> is always somewhere and requires X MB, you can overwrite and reread
> without problems.

This will have to be worked out.

> >
> > CHUNK:
> > chunk start, optional - 2 byte bitmask with some png-like flags
> > "xcf-comment"
> > total size of chunk and subchunks - 4 bytes
> > size of chunk - 4 bytes
>
> For all these sizes... why not 64 and avoid future problems? If
> someone likes it and uses it for really big things, segmentation is a
> negative point. Or maybe force a small max size for each chunk
> (forcing segmentation) which would give more CRCs. Options, options,
> options...

Both have their plusses and minuses.

> > "This is the comment"
> > chunk end (flags) - 2 bytes
> > "xcf-comment"
> > 1 (subchunk depth) - 1 byte
> > crc32 - 4bytes
> [...]
>
> I would add unique chunk ID to each, so they can make references.

Good idea.

> So of your list of items, 1 (lossless), 2 (portable), 3 (extensible),
> 4 (graphs), 7 (depth and spaces), 8 (gimp states) are a must. 5
> (recoverable) will be nice, a lot, but if you want it to work, it
> sounds like some escaping and reserved flags will be needed (like line
> code in transmissions).

If the chunk recoverer finds what it thinks is a valid block, but the
checksum doesn't match, then it will assume there is no valid block there.
So escaping isn't really necessary.

> I would forget 11 (compression), and put 10 (compact) as a secondary to
> 9 (fast load/save) and 6 (fast access). I would add tile based as 12.

Compression of image data is important, although not essential.  10 is
definitely less important than 6/9.
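To make the recovery argument above a bit more concrete, here is a minimal
sketch under invented assumptions.  It uses a much simpler layout than the
CHUNK sketch quoted earlier: a 4-byte marker, a 4-byte length, the payload,
and a CRC-32 of the payload.  MARKER, pack_chunk and recover_chunks are
names made up for this example, and a real chunk would also carry the
flags, the name string and both size fields.

```python
import struct
import zlib

MARKER = b"XCHK"  # hypothetical chunk-start marker, invented for this sketch


def pack_chunk(payload):
    """marker | 4-byte big-endian length | payload | CRC-32 of payload."""
    return (
        MARKER
        + struct.pack(">I", len(payload))
        + payload
        + struct.pack(">I", zlib.crc32(payload))
    )


def recover_chunks(data):
    """Scan for chunks; a candidate whose CRC fails is treated as no chunk."""
    found = []
    pos = 0
    while True:
        pos = data.find(MARKER, pos)
        if pos < 0:
            return found
        header_end = pos + len(MARKER) + 4
        if header_end <= len(data):
            (length,) = struct.unpack(">I", data[header_end - 4:header_end])
            crc_end = header_end + length + 4
            if crc_end <= len(data):
                payload = data[header_end:header_end + length]
                (stored,) = struct.unpack(">I", data[crc_end - 4:crc_end])
                if zlib.crc32(payload) == stored:
                    found.append(payload)
                    pos = crc_end  # resume right after the verified chunk
                    continue
        pos += 1  # nothing valid starts here; keep scanning


if __name__ == "__main__":
    tricky = b"payload that contains XCHK in the middle"
    stream = b"\x00garbage" + pack_chunk(b"first") + b"junk" + pack_chunk(tricky)
    damaged = bytearray(stream)
    damaged[16] ^= 0xFF  # flip one payload byte inside the first chunk
    print(recover_chunks(stream))
    print(recover_chunks(bytes(damaged)))
```

Marker bytes that merely happen to occur inside a payload are only accepted
as a chunk start if the length and CRC that follow also line up, which for
arbitrary data should be on the order of a one in 2^32 accident.  That is
the reason the payload does not need escaping.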
> To some extent, it reminds me of the Blender format (with the add on
> that Blender files are 64 or 32 bit, little or big endian, and all the
> platforms can load them fine... Adam will love it :] )

Joyful. :)

Rockwalrus