On Tue, 04 Dec 2001, Dave Neary wrote:
> On Tue, Dec 04, 2001 at 03:32:18PM +0100, Raphael Quinet wrote:
> > Some time ago, I submitted two bug reports about this:
> > http://bugzilla.gnome.org/show_bug.cgi?id=56443 (EXIF and metadata)
> > http://bugzilla.gnome.org/show_bug.cgi?id=61499 (editing metadata)
>
> I saw and read these (and the standards you pointed to), and
> understood them to mean "I've found this information, but don't
> have the time to work on them". That, and the fact that the TODO
> item was unassigned, convinced me that it was up for grabs :)
That's right. Since I did not (and still do not) know how much time I will be able to spend on that, I did not want to have this task "officially" assigned to me because that could create expectations that I might not be able to fulfill. But I started working on it anyway.
> > As I mentioned in bug #56443, I suggest that you read the discussion
> > about EXIF on the PNG and MNG tools page (http://pmt.sourceforge.net/).
> > The "Rationale" section of the proposal explains why the metadata
> > should not be stored as one big chunk, but instead should be split into
> > individual values and each value should be interpreted, converted or
> > discarded as appropriate. [...]
>
> That has been thought of, and I don't think that one metadata
> structure rules that out. In a way, it's just one bucket in which
> we store the various pieces of information. Of course each of
> them would be individually modifiable, and would be modified as a
> matter of course during the manipulation of the image (a good
> example would be the width & height parameters).
If there is only one metadata structure, then the Gimp and the plug-ins will have to use the same structure. I would like to avoid that and decouple the plug-ins from the core as much as possible. This means that it should be possible to run a plug-in that supports new metadata tags with a version of the Gimp that uses an older version of the structure. If the new tag is present in the image file, then the plug-in should be able to load it from the file, attach it to the image and later save it to the file even if the Gimp does not know what it is. Or vice-versa: it should be possible to use a new version of the Gimp without having to recompile or relink the plug-ins. Using parasites has the advantage that you do not need to modify anything in the core: it only has to be done in the plug-ins (following the documentation that would be distributed with the Gimp).
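To make the decoupling argument concrete, here is a minimal sketch in Python. It is not Gimp code: the `Image` class merely stands in for the core, and the `exif-` parasite names and the two plug-in functions are hypothetical. The point it illustrates is that the core only stores named, opaque byte strings, so a plug-in can round-trip a tag that the core has never heard of.

```python
class Image:
    """Stand-in for the core's image object (hypothetical, not the Gimp API)."""

    def __init__(self):
        self._parasites = {}

    def attach_parasite(self, name, data):
        # The core never interprets the payload; it only keeps it by name.
        self._parasites[name] = bytes(data)

    def find_parasite(self, name):
        # Returns None when no parasite with that name is attached.
        return self._parasites.get(name)

    def parasite_names(self):
        return sorted(self._parasites)


def plugin_load(image, file_tags):
    """Hypothetical load plug-in: attach each tag read from the file as a parasite."""
    for tag, value in file_tags.items():
        image.attach_parasite("exif-" + tag, value.encode("utf-8"))


def plugin_save(image):
    """Hypothetical save plug-in: collect the parasites it recognises by prefix."""
    prefix = "exif-"
    out = {}
    for name in image.parasite_names():
        if name.startswith(prefix):
            out[name[len(prefix):]] = image.find_parasite(name).decode("utf-8")
    return out


image = Image()
# "NewTag2002" plays the role of a tag that only a newer plug-in knows about.
tags = {"Artist": "someone", "NewTag2002": "unknown to the core"}
plugin_load(image, tags)
round_trip = plugin_save(image)
```

In this sketch, adding support for a new tag changes only the plug-in side; the `Image` class stays untouched.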
> > That's why I think that the first thing to do would be to define a
> > list of "standard" image parasites in the Gimp, with precise
> > definitions of the data types and constraints (so that's why I started
> > studying the specs of the various image formats). Once this list is
> > established and approved by everybody, this can be implemented in the
> > load/save plug-ins. There is already a document in the source tree
> > describing some parasites (devel-docs/parasites.txt) so we could
> > expand it and add more information about the constraints (valid ranges
> > and so on) for each parasite.
>
> I have a problem with creating potentially dozens of parasites and
> attaching them to images (possibly on a case by case basis, with some
> parasites not getting attached for some image types), when we could
> attach one object instead.
My goal is to allow the Gimp to attach the appropriate metadata to the files that have a format that supports this information -- not more, not less. This means that GIF will only include a 7-bit ASCII comment limited to chunks of 255 bytes (without saying whether it is a copyright message or document name or author name or whatever), while JPEG/EXIF will have many more features but maybe not the same as the ones supported by TIFF/EP or the proposed extensions to PNG. I do not want to attach extra information to the files if this information was not planned in the specification for that file format. If the spec includes only a limited number of fields, then the Gimp should only save those and drop the other ones because other programs reading or writing these images would ignore or even choke on the non-standard information added by the Gimp. So I do not want to be able to hide EXIF information in a TGA file using extension fields (although this would be possible). I simply want to be able to support the native metadata for each file format in the best possible way so that the Gimp can easily exchange this information with other programs.
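As an illustration of the GIF case, the following Python sketch packs a comment the way a GIF89a Comment Extension expects it: an introducer and label, then sub-blocks of at most 255 bytes, each preceded by a size byte, then a zero terminator. The function name is made up for this example and this is not the code of the Gimp's GIF plug-in.

```python
def gif_comment_extension(comment):
    """Encode a comment as a GIF89a Comment Extension block (sketch only)."""
    # Only 7-bit ASCII is accepted; other text raises UnicodeEncodeError.
    data = comment.encode("ascii")
    out = bytearray(b"\x21\xfe")      # extension introducer, comment label
    for start in range(0, len(data), 255):
        chunk = data[start:start + 255]
        out.append(len(chunk))        # size byte of this sub-block (1..255)
        out += chunk
    out.append(0)                     # block terminator
    return bytes(out)


# A 300-byte comment needs two sub-blocks: 255 bytes and 45 bytes.
encoded = gif_comment_extension("x" * 300)
```

Note that nothing in the block says what the comment means (author, copyright, title), which is exactly the limitation described above.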
This means that we will lose some information when converting from one file format to another, but this is IMHO a better solution. Think about what would happen if the Gimp included some additional metadata in a proprietary field added to some image format. If you modify and then save this file using another program, you will in the best case lose that information and in the worst case the program will copy this unknown metadata back into the new file and it will not match the new contents of the file.
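The "save only what the format supports" rule could be sketched like this in Python. The field names and the per-format sets below are placeholders chosen for illustration, not lists taken from the GIF or EXIF specifications, and `split_for_format` is a hypothetical helper.

```python
# Illustrative only: placeholder field sets, not taken from the format specs.
SUPPORTED_FIELDS = {
    "gif": {"comment"},
    "jpeg-exif": {"comment", "artist", "copyright", "date-time"},
}


def split_for_format(metadata, fmt):
    """Return (kept, dropped): the fields to write and the names left out."""
    allowed = SUPPORTED_FIELDS[fmt]
    kept = {key: value for key, value in metadata.items() if key in allowed}
    dropped = sorted(key for key in metadata if key not in allowed)
    return kept, dropped


metadata = {
    "comment": "Created with the Gimp",
    "artist": "someone",
    "copyright": "(c) someone",
}
kept, dropped = split_for_format(metadata, "gif")
```

Returning the dropped names alongside the kept fields would let a save plug-in warn the user about what is lost in the conversion, instead of hiding it in a non-standard field.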
> My personal opinion is that there's no need to tie ourselves down
> to a host of parasites when we can come up with one, hopefully
> clean, image data structure which will hold everything and will
> get passed around with the image, and modified when it needs to
> be accordingly. If we allow for the structure to be dumped
> independent of the image too (in xml, or some other friendly
> format), then we will have a consistent metadata approach, and
> every image may have a variety of (editable) data fields
> associated with it. In addition, once we make the structure
> sufficiently complete, the various formats that support extra
> comments (such as exif/png) can write the bits they want into the
> images saved in the plug-in.
See my comment above about decoupling the core and the plug-ins. Besides, the parasites are not that bad. They are reasonably compact (in memory and in the XCF file) and easy to access from a plug-in. I think that it is nice for a plug-in to be able to fetch or set only the "gimp-comment" parasite without having to know anything about the other parasites that may be set on the image.
Also, parasites are already supported by the current XCF file format so if we decide to standardize the names and types of the parasites that are useful for EXIF metadata, then they can be easily integrated in the plug-ins for gimp-1.2.x.
> Of course, I'm not saying "my way or the highway", but between
> the two approaches (numerous parasites tailored to the format,
> and one universal metadata object, parts of which are relevant to
> each format, which can also be saved separate from the image), I
> favour the latter.
I favour the former, but this is probably because we have different goals in mind about this metadata:
- I would like to make sure that the Gimp can exchange as much information as possible with other programs (using the metadata that is supported natively by each file format -- maybe at the cost of losing some information).
- If I have understood you correctly, your goal is to preserve the metadata regardless of the file format being used and you want to ensure that all load/save operations in the Gimp will keep this information intact (but maybe not if other programs are used).
We also have different ideas in mind regarding the implementation:
- I would like to implement this without requiring any changes in the core. The plug-in authors would only have to export/import the corresponding parasites in the file and would not depend on any changes done in the core or in other plug-ins, so this can already be done for the current stable version of the Gimp.
- You would like to implement this in the core by adding a new metadata structure that is attached to the image.
So it is not surprising that we disagree. I think that I am right and you think that you are right. ;-) Does anybody else have an opinion on this?
-Raphael