On Mon, Apr 02, 2012 at 09:50:53PM +1000, elton sky wrote: > Hi Nguyen, > > Still have some questions on your idea: > > On Tuesday, March 27, 2012, Nguyen Thai Ngoc Duy wrote: > > >> A block starts with a signature (a tree block, or an extension...). A > > >> tree block consists of: > > >> > > >> - uint32_t tree object's size > > >> - sha-1 of tree object > > >> - crc32 of the rest of the block except tree object > > >> - maybe reference counter of a block can be refered by many blocks?? > > >> - tree object (i.e. something that tree-walk.c can parse) > > > > > > Do you mean each block contains a tree and all its blobs? So the tree > > > object here, effectively a dir, also contains files in the dir ? In > > > this way, some blocks can be very big. > > > > No, the tree object contains pathname, mode and SHA-1 of its entries, > > one level only (try "git ls-tree HEAD"). If an entry is a directory > > and we have not built it yet, we won't have its sha-1, so it will be > > zero (similar to invalid cache-tree). > > > > Correct me if I am wrong, I assume: > * Although you only listed attributes in tree block, in index, we have > both tree and blob block. In current index, tree is implied in path names. We only store blob sha-1. > * Sha1 of a tree block is computed by hashing all the Sha1s of blobs > and sub trees in the directory. > * Like cache tree struct, each tree object contains list of offsets to > children blocks. > > How do you store blob blocks ? Is each blob object a block? just like > a tree object is a block? If so, each blob contains a sha1, the > overhead is high? > > If we compute hash for a tree, and this tree happen to have 500 > children do I have to access all 500 blocks to get their Sha1s? We don't store blobs in index. We only need their sha-1, which is computed and content stored in object database at "git add". By the way, I revised my new index format a little bit, see the end of this email. It may work, or may not. Food for thoughts. > Also I ran some quick test on git-add over kernel 2.6. When I do "time > git add .":time git add . > > cmd_add: validate_pathspec takes : 0 ms > read_index_from: xmmap&close takes : 0 ms > read_index_from: verify_hdr takes : 26 ms > read_index_from: create inmem struct takes : 4 ms > read_index: read_index_from takes : 31 ms > read_directory: qsort takes : 0 ms > fill_directory: read_directory takes : 97 ms > cmd_add: prune dir takes : 0 ms > cmd_add: add_files_to_cache takes : 37 ms > cmd_add: add_files takes : 0 ms > > real 0m0.172s > user 0m0.120s > sys 0m0.050s > > And when I ran "time git add arch/ia64" : > > cmd_add: validate_pathspec takes : 0 ms > read_index_from: xmmap&close takes : 0 ms > read_index_from: verify_hdr takes : 20 ms > read_index_from: create inmem struct takes : 4 ms > read_index: read_index_from takes : 25 ms > read_directory: read_directory_recursive takes : 10 ms > read_directory: qsort takes : 0 ms > fill_directory: read_directory takes : 10 ms > cmd_add: fill_directory takes : 10 ms > cmd_add: prune dir takes : 0 ms > cmd_add: add_files_to_cache takes : 1 ms > > real 0m0.043s > user 0m0.040s > sys 0m0.000s > > In both cases, the time for sha1 is quite stable (~20ms). > fill_directory drops as I specify a sub directory, which make sense. > As Junio suggested, the sha1 time (verify_hdr) is a mix a read and > sha1. And this part is our focus to optimize, isn't it? I think so. But until we can read just parts of index, we still have to verify integrity for the whole index, which takes more or less the same amount of time you see and should be proportional to index size (or the number of entries in index). Either we shrink the index, or go with cheaper checksum, or both. > But, as growth > of the whole repo, the processing time is getting dominated by > fill_directory (if we use '.', it takes 97ms) rather than verify_hdr. {read,fill}_directory is not always used (for example, "git diff" does not need it). Meanwhile, as working directory grows, index size grows, verify_hdr() will take longer. > In current system, The time complexity for fill_directory is nlogn (n > is number of objects). git recursively go thru sub directories and > files in it and check against current index. When searching index, it > uses binary search which makes it lg(n). If this is the case, will use > a producer/consumer model help? I think fill_directory is dominated by kernel time (read_dir, stat...), there's little thing we can do there. Anyway I'm pretty sure fill_directory is out of scope. It's just one of the code that uses index. -- 8< -- GIT index format ================ This format replaces the old "DIRC" format. Compared to the old format, which is essentially a sorted list of pathnames, this one: - is tree-based - use crc32 as checksum - only verify integrity on parts that git accesses, instead of whole file - append changes to the end - allow index versioning Updates can be made directly to the index by appending to the end. The index traversed by locating the root tree block from the trailer. When a path is updated, all related tree blocks are updated and appended to the end, then a new trailer (with generation increased by one) is written to conclude the index. The index size will increase continuously. At some point, we will need to repack it. Let assume a tree block is 64k on average and a path generally consists of 3 path components. That means an entry update adds 192k and we can do about 80 updates before index reaches 16M (in addition to initial index size). At 16M or when trailer generation hits a limit (the limit can be configurable), we rewrite the index to reduce its size. Some heavy operations can also be used to rewrite index, such as checkout or reset. The index integrity is verified by crc32. One crc32 covers header and trailer. Each block has its own crc32. When the index is found corrupt, we could try to roll back to latest good version by looking for trailers from bottom up. Even when the index is not corrupt, users can still look back this way for older index versions. = The git index file has the following format - A 8-byte header consisting of 4-byte signature: The signature is { 'T', 'R', 'E', 'E' } 4-byte version number: The current supported versions are 1. - A number of blocks of variable size 1-byte block type 3-byte content size in byte block content 4-byte crc32 of all above - A 18-byte trailer consisting of 4-byte trailer signature: The signature is { 'R', 'O', 'O', 'T' } 2-byte generation: The first trailer is 0, the second 1 and so on. 4-byte root block offset 4-byte extension table offset: Zero means no extension 4-byte checksum: CRC32 of the header and the trailer (excluding this field) == Tree block A tree block contains a (maybe invalid) tree object and extra information of its companion in working directory. Tree block has block type 'T'. Tree block content is basically the list of non-recursive entries in specified path, with all attributes we store in the index now. There are a few changes though to intergrate cache-tree and allow bsearch() on mmap'd block. A tree block content consists of - 4-byte tree object size - 20-byte SHA-1 of the cached tree object - a list attributes corresponding to tree object's item, in the same order. These attributes are the same as in DIRC entry format except that entry name is removed, and a tree block offset is added in case the item is a directory. 32-bit ctime seconds, the last time a file's metadata changed this is stat(2) data 32-bit ctime nanosecond fractions this is stat(2) data 32-bit mtime seconds, the last time a file's data changed this is stat(2) data 32-bit mtime nanosecond fractions this is stat(2) data 32-bit dev this is stat(2) data 32-bit ino this is stat(2) data 32-bit mode, split into (high to low bits) 4-bit object type valid values in binary are 1000 (regular file), 1010 (symbolic link) and 1110 (gitlink) 3-bit unused 9-bit unix permission. Only 0755 and 0644 are valid for regular files. Symbolic links and gitlinks have value 0 in this field. 32-bit uid this is stat(2) data 32-bit gid this is stat(2) data 32-bit file size This is the on-disk size from stat(2), truncated to 32-bit. 160-bit SHA-1 for the represented object if blobs or the offset to another tree block if trees A 32-bit 'flags' field split into (high to low bits) 1-bit assume-valid flag 1-bit extended flag (must be zero in version 2) 2-bit stage (during merge) 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF is stored in this field. 1-bit skip-worktree flag (used by sparse checkout) 1-bit intent-to-add flag (used by "git add -N") 14-bit unused, must be zero A 16-bit offset, relative to the beginning of this block, to the pathname of this entry. FIXME: make it 32-bit, relative to the beginning of the file, so that we can reuse pathnames from other (old) blocks? - a list of NUL-terminated pathnames, pointed to from the 16-bit offset above. This list does not have to be of the same order as the attribute list. The reason this is separated from the attribute list is to make attribute list fixed size, searchable using bsearch(). == Extension table block Extension table has block type 'X'. It consists of a series of 4-byte extension block offset. == Extension block Extension block has block type 'E'. Extension content is the same as in the old format. -- 8< -- -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html