On Tue, 04 Dec 2001, Dave Neary wrote:
> So I've been pretty quiet for several months now, and I recently
> nailed my flag to the mast, picked something from the TODO and
> started working on it. It's the Image metadata object item, which
> grew out of a desire to get the data out of the EXIF image format
> used in digital cameras and view it.
Well, it is good that you posted that information to the list, because I also started working on that problem although I have a slightly different point of view. Instead of considering only the EXIF format, I decided to investigate the more general problems of storage and exchange of metadata in various image formats and also how to allow the user to edit this information in the Gimp.
Some time ago, I submitted two bug reports about this:
  http://bugzilla.gnome.org/show_bug.cgi?id=56443 (EXIF and metadata)
  http://bugzilla.gnome.org/show_bug.cgi?id=61499 (editing metadata)
I started to work on this, mostly by reading the specs of the various image formats that are currently supported by the Gimp and checking the type of metadata supported by each of them (and requirements on size, character set, and so on). I did not announce it because (like you) I can only spend a few hours per week on the Gimp besides the time that I already spend on the bug database and on the web site.
But you made the right decision: discussing it on this list is the best way to avoid duplicating effort or wasting time by going in the wrong direction.
(Side note: I cannot access #gimp on IRC because of a firewall on my main Internet access. I assume that I am not the only one. That's why I think that all important things should be discussed on this list and not on IRC. Also, the archives of this list are publicly accessible and show up in search engines, which is nice for the part-time contributors who are not subscribed to this list.)
> I plan to write some kind of GimPImageMetadata object which will
> expand on the stuff that EXIF can hold, and to have the ability
> to create/edit/save the metadata separate to/along with the image
> (kind of like a thumbnail) where the image format doesn't support
> metadata, or in the image where it does. I reckon I'll probably
> work on that over the course of the next month or so (I only
> really get a few hours gimptime a week).
>
> As a matter of interest, does anyone have any ideas beyond "use a
> parasite, or a number of parasites" on how I can pass the data
> between the gimp and the plugins? All the image save and load
> functions should be able to see it, whether they use it or not is
> up to them. Also, beyond the data fields supported by EXIF, are
> there other metadata fields that people would like to see?
As I mentioned in bug #56443, I suggest that you read the discussion about EXIF on the PNG and MNG tools page (http://pmt.sourceforge.net/). The "Rationale" section of the proposal explains why the metadata should not be stored as one big chunk, but instead should be split into individual values, with each value interpreted, converted or discarded as appropriate. Since the Gimp is an image manipulation program, one can expect that the image data will be modified by the user. I have the impression that you are trying to preserve the whole EXIF metadata when saving the image, which is probably not a good idea because some parts of it may not be valid anymore after the image has been modified (or simply saved in a different format).
That's why I think that the first thing to do would be to define a list of "standard" image parasites in the Gimp, with precise definitions of the data types and constraints (so that's why I started studying the specs of the various image formats). Once this list is established and approved by everybody, this can be implemented in the load/save plug-ins. There is already a document in the source tree describing some parasites (devel-docs/parasites.txt) so we could expand it and add more information about the constraints (valid ranges and so on) for each parasite.
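To make the idea of precisely defined parasites a bit more concrete, here is a minimal sketch in Python (not Gimp code) of what a table of definitions with constraints could look like. Only "gimp-comment" and its UTF-8 encoding come from the existing conventions. The ParasiteSpec type, the validate() helper, the "example-author" entry and its 255-byte limit are all hypothetical and only illustrate the shape such a list might take.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ParasiteSpec:
    """Hypothetical description of one "standard" image parasite."""
    name: str
    encoding: str              # character set of the stored bytes
    max_bytes: Optional[int]   # None means no size limit
    description: str


# Only "gimp-comment" is an existing parasite name; the second entry
# and its limit are invented for illustration.
SPECS: Dict[str, ParasiteSpec] = {
    spec.name: spec
    for spec in (
        ParasiteSpec("gimp-comment", "utf-8", None, "Image comment"),
        ParasiteSpec("example-author", "utf-8", 255, "Author name (hypothetical)"),
    )
}


def validate(name: str, data: bytes) -> str:
    """Check raw parasite data against its definition and return the text.

    Raises KeyError for an unknown parasite, ValueError if the data is
    too long, and UnicodeDecodeError (a ValueError subclass) if the data
    is not valid in the declared encoding.
    """
    spec = SPECS[name]
    if spec.max_bytes is not None and len(data) > spec.max_bytes:
        raise ValueError(f"{name}: {len(data)} bytes exceeds {spec.max_bytes}")
    return data.decode(spec.encoding)
```

A load plug-in could call something like validate() before attaching a parasite, so that every plug-in rejects the same malformed values in the same way.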
For example, the parasite "gimp-comment" is used to store the image comment and should use UTF-8 encoding. Some image formats such as GIF have only one image comment field (in 7-bit ASCII) while EXIF (JPEG) and TIFF/EP have separate fields for author name, copyright, image description, user comments and so on. PNG supports several text chunks of variable length, while other formats such as TGA have strange limitations (the author name is limited to 40 characters and the comments are limited to 4 lines of 80 characters). The best way to handle the conversions between the different image formats is to store everything in UTF-8 and apply the appropriate conversions when saving/loading. But in order to avoid any ambiguity, it is necessary to define the parasites in a very precise way so that all plug-ins will use the same fields for the same purpose. Maybe we even need some meta-meta-information saying for example if we are sure about the character set that was used in the original image comment or if this is only a best guess. I hope to have a proposal ready soon, but I cannot say if this will be next week or next year...
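As a rough illustration of the "store everything in UTF-8 and convert when saving/loading" approach, here is a small Python sketch (again not Gimp code). The limits for GIF and TGA are the ones mentioned above. All function names are hypothetical, replacing unrepresentable characters with "?" is only one possible policy, and the Latin-1 fallback on load is an assumption chosen to show how a "best guess" flag could be carried along with the text.

```python
from typing import List, Tuple


def to_gif_comment(comment: str) -> bytes:
    """Encode a comment for a 7-bit ASCII field.

    Characters outside ASCII are replaced by '?' (an assumed policy).
    """
    return comment.encode("ascii", errors="replace")


def to_tga_author(name: str, max_chars: int = 40) -> str:
    """Reduce an author name to ASCII and truncate it to 40 characters."""
    ascii_name = name.encode("ascii", errors="replace").decode("ascii")
    return ascii_name[:max_chars]


def to_tga_comment_lines(comment: str, max_lines: int = 4,
                         width: int = 80) -> List[str]:
    """Split a comment into at most 4 lines of at most 80 characters.

    Anything that does not fit is silently dropped in this sketch.
    """
    ascii_text = comment.encode("ascii", errors="replace").decode("ascii")
    lines: List[str] = []
    for raw in ascii_text.splitlines():
        # Cut each input line into chunks of `width` characters;
        # an empty input line is kept as one empty output line.
        for start in range(0, max(len(raw), 1), width):
            lines.append(raw[start:start + width])
    return lines[:max_lines]


def from_legacy_comment(data: bytes) -> Tuple[str, bool]:
    """Decode a comment read from a file into text plus a certainty flag.

    Pure ASCII is treated as certain. Anything else is decoded as
    Latin-1 and flagged as a best guess (assumption for illustration).
    """
    try:
        return data.decode("ascii"), True
    except UnicodeDecodeError:
        return data.decode("latin-1"), False
```

The boolean returned by from_legacy_comment() is the kind of meta-meta-information I have in mind: it would let the Gimp tell the user that the character set of a loaded comment was only guessed.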
-Raphael