Peter Eriksen <s022018@xxxxxxxxxxxxxx> wrote: > I have been reading the commits in the > git://repo.or.cz/git/fastimport.git/ repository (git makes it quite easy > to see what differs from mainline using "git log master..pack4"), and I > think, I have understood some of the details. Just to be clear, that branch is strictly a proposed prototype of what a pack version 4 *might* look like. Absolutely nothing has been set into stone for that file format. A good chunk of that code needs to be reworked just to get it merged onto Junio's current 'master' as Nico and myself have been doing a number of cleanups and bug fixes in some of the affected areas. ;-) > The easiest thing to get was the file name table, which is placed in the > beginning of the pack (after the header) using the format: That's not true. The filename table (EXTOBJ_FILENAME_TABLE) may appear at any position within the packfile (but like all objects it must appear somewhere after position 12, as that is where the header ends). Now to help out the unpackers (index-pack and unpack-objects) we have the convention that this table is written out before the first OBJ_DICT_TREE. That way the unpacker can load the table and have it ready to go when it sees the first OBJ_DICT_TREE. If we didn't have this rule the unpackers would need to hang into all OBJ_DICT_TREEs they see until they get the EXTOBJ_FILENAME_TABLE, then they could actually process those pending OBJ_DICT_TREEs. This is somewhat expensive on memory, and is just ugly to code. So what you will find is that the EXTOBJ_FILENAME_TABLE is dumped out behind all of the commits, but before the first OBJ_DICT_TREE, and since all trees tend to get converted to an OBJ_DICT_TREE, the EXTOBJ_FILENAME_TABLE is sandwiched exactly between the commits and the trees. Since the unpackers will probably never be smart enough to handle an OBJ_DICT_TREE before an EXTOBJ_FILENAME_TABLE, its likely that we'll just have the file format requirement that the EXTOBJ_FILENAME_TABLE must appear before the first OBJ_DICT_TREE, but can otherwise appear at any position in the file. The reason we put the EXTOBJ_FILENAME_TABLE behind the commits is we often walk the commit chains (following parent pointers) without looking at the trees at all. Consider `git log`, in the default settings we don't need the trees. By keeping the filename table behind the commits the OS read-ahead buffering gets a better chance at loading all of the data we need, and none of the data we don't. So that's why its where it is. > +------------+-------------------------------+ > | NR_ENTRIES | Compressed file name table | > +------------+-------------------------------+ > 4 bytes No. A string table object (both the EXTOBJ_FILENAME_TABLE and the EXTOBJ_IDENT_TABLE) has its uncompressed size stored in the standard "size" field within the object header. (This lets us malloc the proper buffer quickly.) Immediately behind that object header is the deflated table. The deflated table looks like: +------------+-------+-------+-------+--------+----+ | NR_ENTRIES | MODE1 | str1 | MODE2 | str2 | ... +------------+-------+-------+-------+--------+----+ 4 bytes 2 bytes n1 2 bytes n2 The field NR_ENTRIES is in big-endian (network) byte order. Each MODE field is also in big-endian byte order. Each string is null terminated. The lengths n1 and n2 in the diagram above would include the null terminating byte. There is no end-of-table marker; the way to know the you reach the end of a table is by counting NR_ENTRIES records out. I did consider making NR_ENTRIES a vint (variable length int), but decided against it for the sake of simplicity. ;-) For starters its much easier to just treat the darn thing as a 32 bit value and use ntohl. Its also easier in the pack-objects code, as I can reserve that space at the front of the table, as the size is fixed. Also, since its is actually inside of the deflated zlib stream, and null bytes are very common (string terminaters) any unnecessary leading nulls will probably compress quite well, as the null byte will probably get a relatively short encoding in the compressed stream. The MODE fields are the standard POSIX mode bits in an EXTOBJ_FILENAME_TABLE. In an EXTOBJ_IDENT_TABLE the MODE fields actually store the preferred timezone offset (hours in the first/high byte, minutes in the second/low byte) of the user whose name/email is stored in the string field. > The table is sorted by path then mode for easy binary lookup, and so > that pointers into this table can be compared directly instead of > comparing the corresponding paths and modes. Uh, not quite. In the case of EXTOBJ_FILENAME_TABLE we sort by name+type using the messy base_name_compare. In this sorting string entries whose mode match S_ISDIR (are directory modes) sort as though their name ends with "/" (even though they actually don't). If there is a tie, we break the tie by sorting by the mode alone. In the case of EXTOBJ_IDENT_TABLE we plan to sort by frequency of occurance only. This sorting puts the most frequent users at the start of the table, allowing us to reference the top 128 authors and committers in just 1 byte, and the next 16,257 top authors and committers in just 2 bytes (as we use vints to index into here, more later). > There is a new tree type called OBJ_DICT_TREE, which looks something > like the following: > > +-----------------+------------------------------------------------+---- > | Table offset | SHA-1 of the blob corresponding to the path. | ... > +-----------------+------------------------------------------------+---- > 6 bytes 20 bytes No. As Nico stated the records of an OBJ_DICT_TREE are actually only 6 bytes each. Actually an OBJ_DICT_TREE is *not* comprssed in the packfile. I want to stress this point, as its unlike most other object types where the data after the header is just a zlib stream. Its data looks like: +------------+-------+-------+-------+-------+---- | NR_ENTRIES | name1 | hash1 | name2 | hash2 | ... +------------+-------+-------+-------+-------+---- vint 2 bytes 4 bytes 2 bytes 4 bytes The NR_ENTRIES field is our "standard" variable integer encoding (the encoding used by OBJ_OFS_DELTA). It tells us how many tree entries to expect. name1 is an index into the packfile's sole EXTOBJ_FILENAME_TABLE. hash1 is an index into the packfile's sole SHA1 table. This object type hasn't been declared yet, but will be. Both fields are in big-endian / network byte order. > These new tree objects will remain uncompressed in the pack file, but > sorted with, Yes, correct. > and deltaed against other tree objects. *only* against other OBJ_DICT_TREEs. If a tree could not be converted to an OBJ_DICT_TREE then it stays as an OBJ_TREE and only deltas against other OBJ_TREEs. > All normal tree > objects are converted to OBJ_DICT_TREE when packing, Almost. We try to convert all trees to OBJ_DICT_TREE when packing, but we cannot do so if the EXTOBJ_FILENAME_TABLE does not contain one or more path/mode pairs required by that tree. This can happen if the EXTOBJ_FILENAME_TABLE would need to contain more than 2**16 entries, as the index into that table (name1 above) is strictly a 16 bit unsigned value. Thus we have a rule in pack-objects where we first sort the EXTOBJ_FILENAME_TABLE by frequency, clipping it to the top 2**16 entries, then we resort it according to the name+mode sort. > and are converted > back on the fly to callers who need an ordinary OBJ_TREE. Yes. But we don't want to actually do that. One of our goals is to adjust tree-walk.c (and if needed its callers) to directly handle an OBJ_DICT_TREE. This way we can avoid a lot of costly decompression. Further I think we can play a game with the delta encoder and delta apply routines where we can even avoid applying OBJ_DICT_TREE deltas when we are walking the tree; instead we can walk the deltas directly. This is the primary motiviation for keeping the OBJ_DICT_TREE format a fixed width record, even if it might waste a tiny amount of space for some projects. > The index (.idx) files are extended to have a 4 byte pointer to the > offset of this file name table in the pack file for easy lookup. Yes. But these may become 64 bit offsets, to allow for very large packfiles. > There is something similar with a table of common strings in commit > objects (e.g. author and timezone), and a new object OBJ_DICT_COMMIT, > but I have not understood that quite yet. Its actually EXTOBJ_DICT_COMMIT. The idea here is that author and committer strings appear very commonly thoughout a project. Look at Junio for example in git.git, there are more than 3,000 commits with his name on them. These compress rather poorly, and don't delta against each other very well at all. By pulling these common strings out to an EXTOBJ_IDENT_TABLE we can save some space. The other idea is to store the tree and the parent commits in pure binary (so SHA-1s are 20 bytes, not 40 bytes hex) and to avoid text headers, so that we can parse the important fields of a commit that are needed for revision walking immediately from the raw pack data. Since SHA-1s are uncompressable we aren't actually losing any disk space here either. Actually in my early experiements (predates the packv4 code you looked at) this was saving about 63 bytes per commit. > Is there something, I have gotten wrong with regards to my > understanding? You're close. Not bad for no documentation! ;-) -- Shawn. - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html