> On Wed, Feb 06, 2002 at 02:17:21PM +0100, Raphael Quinet <quinet@xxxxxxxxxx> wrote:
> > EXIF data and simply copy the descriptions given in the EXIF standard.
> > Some of the fields will have to be discarded (or set read-only or not
> > persistent) because they only make sense for the original file format
>
> It might hurt, but I think the best thing is to not attach these values at
> all, as there is no semantics attached to them until the core recognizes
> and modifies them properly on edits (at least for most values).
A large part of the EXIF data can be useful for other plug-ins, so it should use the "gimp-*" namespace for the parasites. That way, the semantics of each parasite would be known.
Other parts of the EXIF data are less interesting, usually because they describe some properties of the file format that would be changed when the image is saved into a new file (for example, "RowsPerStrip", "SamplesPerPixel", "JPEGInterchangeFormatLength"). I think that we should use non-persistent parasites for these parts. The data would not be saved in a new file (unless it is reconstructed by the JPEG/EXIF plug-in) but it would be available after the image has been loaded.
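To make the persistent/non-persistent split concrete, here is a minimal sketch in Python. It does not use the real GIMP parasite API: the `Parasite` class, the "gimp-exif-" name prefix and both helper functions are assumptions made up for illustration. Only the three format-specific tag names come from the text above.

```python
from dataclasses import dataclass

# Tags named above as describing the layout of the original file.
FORMAT_SPECIFIC_TAGS = {
    "RowsPerStrip",
    "SamplesPerPixel",
    "JPEGInterchangeFormatLength",
}


@dataclass
class Parasite:
    """Hypothetical stand-in for a parasite: a named blob plus a flag."""
    name: str
    data: bytes
    persistent: bool


def exif_to_parasites(tags):
    """Turn a {tag_name: value} mapping into a flat list of parasites."""
    parasites = []
    for tag, value in tags.items():
        parasites.append(Parasite(
            name="gimp-exif-" + tag,          # hypothetical naming scheme
            data=str(value).encode("utf-8"),  # scalar stored as UTF-8 text
            persistent=tag not in FORMAT_SPECIFIC_TAGS,
        ))
    return parasites


def parasites_to_save(parasites):
    """Only persistent parasites follow the image into a new file."""
    return [p for p in parasites if p.persistent]


# Everything is available after loading, but only part of it is saved.
loaded = exif_to_parasites({"Make": "ExampleCam", "RowsPerStrip": 8})
saved = parasites_to_save(loaded)
```

In this sketch a properties dialog would read from `loaded`, while a save plug-in would only see `saved`.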
Even if the core does not make use of this information, it could still be displayed to the user by a "File->Properties" plug-in. The dialog window created by this plug-in could provide several tabs that are specific to some file formats. There would be at least one or two tabs for the EXIF data, even for the non-persistent parts. I think that it would be very interesting for the user to know a bit more about the properties of the original file, even if these properties are lost after the conversion to a flat RGB bitmap. Even tools such as ImageMagick's identify do not always provide the information that I am looking for, although I know that it is loaded and parsed by the application.
> The biggest problem is the format inside parasites - i personally dislike
> gserialize (unused anyway), and strongly favour very simple decomposing,
> i.e. scalars ("single-valued string-thingy") where possible. So the more
> parasites the better ;)
I fully agree. All pieces of metadata can be decomposed to strings or single numbers. Blocks of raw data should also be allowed for special cases such as ICC color profiles, but only if these blocks follow a well-specified format (including endianness and other things that could cause problems on different platforms).
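As a rough illustration of the difference between a scalar and a raw block, the Python sketch below stores a number as text and stores a small array in a binary layout with an explicit byte order. The layout (a 32-bit count followed by 16-bit values, little-endian) and all function names are assumptions chosen for the example, not an existing format.

```python
import struct


def encode_scalar(value):
    """A single number stored as UTF-8 text has no byte-order problem."""
    return str(value).encode("utf-8")


def decode_int(data):
    """Read back an integer that was stored as text."""
    return int(data.decode("utf-8"))


def encode_block(values):
    """Pack a list of small integers into a raw block.

    Hypothetical layout: uint32 count, then one uint16 per value.
    The "<" prefix forces little-endian with no padding, so the
    bytes are the same on every platform.
    """
    return struct.pack("<I%dH" % len(values), len(values), *values)


def decode_block(data):
    """Inverse of encode_block, using the same explicit byte order."""
    (count,) = struct.unpack_from("<I", data, 0)
    return list(struct.unpack_from("<%dH" % count, data, 4))
```

Without the explicit "<", `struct` would use the native byte order and alignment of the machine, which is exactly the kind of platform dependency that a well-specified block format has to rule out.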
As Sven has already suggested, all text strings should use UTF-8 encoding. It would be up to each plug-in to convert whatever is appropriate for each file format to/from UTF-8 if necessary. Currently, even the usage of "gimp-comment" causes some problems because some file formats (such as PNG) require ISO-8859-1 (Latin-1) for the character set, some say that the comment should be encoded in the user's current character set (how do you exchange files with others, then?), some others require 7-bit ASCII and most of them do not specify anything.
These strings should not have any constraints regarding their length or format (single line or multiple lines of text). It should be up to each plug-in to do the appropriate conversions if necessary.
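The conversions that each plug-in would have to do could look roughly like the Python sketch below. The function names are made up for illustration, and replacing characters that the target format cannot represent is only one possible policy (a plug-in could also refuse to save or ask the user).

```python
def comment_from_file(raw, file_encoding):
    """Decode a comment as read from a file in that format's encoding."""
    return raw.decode(file_encoding, errors="replace")


def comment_to_parasite(text):
    """Parasite data is always UTF-8, whatever the file format used."""
    return text.encode("utf-8")


def comment_for_file(parasite_data, file_encoding):
    """Convert the UTF-8 parasite back to what the file format requires.

    Characters that the target encoding cannot represent are replaced
    by "?" here; this is an assumption, not a recommended policy.
    """
    text = parasite_data.decode("utf-8")
    return text.encode(file_encoding, errors="replace")


# A Latin-1 comment from one file, stored as UTF-8 in the parasite.
parasite = comment_to_parasite(comment_from_file(b"caf\xe9", "latin-1"))
```

Saving that parasite into a Latin-1 format gives back the original bytes, while a 7-bit ASCII format loses the accented character.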
On Wed, 06 Feb 2002, Sven Neumann <sven@xxxxxxxx> wrote:
> > And the most natural place for this is parasites, [...]
>
> exactly. If there's a need to improve the current parasites, let's do
> that now. I could imagine that a more hierarchical structure might
> help, but I'd like to see a real usage case before we consider doing
> such a change. Is the EXIF data such a usage case?
No, all parts of the EXIF data can be stored in a flat list without losing any information. There are some sub-structures such as the GPSInfo, but that can easily be flattened as long as all parasites start with the same prefix ("gimp-gps-latitude", "gimp-gps-longitude", "gimp-gps-img-direction" and so on).
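The flattening described above can be sketched in a few lines of Python. The nested dictionary, the sample values and the `flatten` helper are hypothetical; only the resulting parasite names follow the examples given in the text.

```python
def flatten(metadata, prefix="gimp"):
    """Flatten nested metadata into {parasite_name: string_value}.

    Each level of nesting becomes one more "-"-separated part of the
    name, so a sub-structure stays recognizable by its common prefix.
    """
    flat = {}
    for key, value in metadata.items():
        name = prefix + "-" + key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = str(value)
    return flat


# Made-up sample values for a GPSInfo sub-structure.
exif = {
    "gps": {
        "latitude": "50.8467",
        "longitude": "4.3525",
        "img-direction": "90",
    },
}

flat = flatten(exif)

# The sub-structure can be recovered by filtering on the shared prefix.
gps_names = sorted(n for n in flat if n.startswith("gimp-gps-"))
```

Nothing is lost as long as no parasite outside the sub-structure reuses the same prefix.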
I had a look at some other file formats that can contain significant amounts of metadata (PNG, TIFF, TGA) and I did not see anything that would require more hierarchical structures. This is not completely true because TIFF can be arbitrarily complex, but we will probably never try to cover all sub-formats of the TIFF "standard".
-Raphael