On Fri, May 4, 2012 at 12:25 AM, Thomas Gummerer <t.gummerer@xxxxxxxxx> wrote: > == Directory entry > > Directory entries are sorted in lexicographic order by the name > of their path starting with the root. > > Path names (variable length) relative to top level directory (without the > leading slash). '/' is used as path separator. '.' indicates the root > directory. The special patch components ".." and ".git" (without quotes) > are disallowed. Trailing slash is also disallowed. > > 1 nul byte to terminate the path. > > 32-bit offset to the first file of a directory > > 32-bit offset to conflicted/resolved data at the end of the index. > 0 if there is no such data. [4] If it's non-zero, how do we know how many conflict entries we have? > 4-byte number of subtrees this tree has let's name this nr_subtrees > 4-byte number of entries in the index that is covered by the tree this > entry represents. (entry_count) (-1 if the entry is invalid) and this nr_entries. So how do we know how many entries (including all dirs, files, staged files) this directory has? I assume if enry_count != -1, the number would be nr_subtrees + nr_entries (or just nr_entries, depending on your definition). When entry_count == -1, how do we calculate this number? > 160-bit object name for the object that would result from writing > this span of index as a tree. > > The last 24 bytes are for the cache tree. An entry can be in an > invalidated state which is represented by having -1 in the entry_count > field. If an entry is in invalidated state, the next entry will begin > after the number of subtrees, and the 160-bit object name is dropped. > > The entries are written out in the top-down, depth-first order. The > first entry represents the root level of the repository, followed by > the first subtree - let's call it A - of the root level, followed by > the first subtree of A, ... Assume the command is "git diff -- path/to/h*", we don't need full index, just stuff in "path/to/h*" from the index. I'm trying to see how to load just those paths from index, not full index. I assume again that you won't invent a new function and use tree_entry_interesting() to do tree pruning while loading index. t_e_i() is designed to read tree objects. But I think we can make it read on-disk directory/file entries with a few small changes. t_e_i() is recursive and fits quite well with depth-first directory layout in the proposed index format. I have difficulties figuring out how you skip subtrees though. Assume we are at "path" and we are not interested in anything there until we meet "path/to", how do you skip subtrees "path/abc" and "path/def"? Processing directory entries sequentially will eventually get us to "path/to", but that could be a lot of entries if "path/abc" is deep. A file offset pointer to the next sibling directory entry might help. Does such a pointer exist but I did not see it, or you have other means to do this? Also the file/dir separation makes it more difficult to match the last "h*" part, if there are both "here" directory and "howto" file. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html