.... just realize I have to register and submit my proposal on gsoc website, rather than here. I stupidly missed that ... silly enough.. On Fri, Apr 6, 2012 at 1:15 PM, elton sky <eltonsky9404@xxxxxxxxx> wrote: > NOTE: > Please ignore the > >>- checkpoint : interleave with items in the block, 1 for every 100 items >> 4-byte offset to next checkpoint > > That's irrelevant. > > Cheers, > Elton > > On Fri, Apr 6, 2012 at 1:13 PM, elton sky <eltonsky9404@xxxxxxxxx> wrote: >> Thank you everyone for ideas, clues, explanations and questions. >> Collectively I wrote my proposal. This one is mostly based on Duy's >> suggestion with minor changes. >> >> >> Problem: >> Current index store all files in a list. This implies that: >> 1. Each operation has to read the whole index and write the whole index back. >> 2. computing the checksum for the whole index. >> These become expensive when the repo is large. >> >> Requirement: >> Suppose n is the number objects in a repo >> -Keep the read/write time <= O(logn). >> -Keep the checksum computed against only necessary objects. >> -New format is easy to parse. >> -Backward compatible. >> -Potentially use faster hash method. >> >> Proposed solution: >> >> Store the repo structure in a canonical tree. Each directory is a tree >> block. A tree block contains blobs and offsets to sub directories. It >> has its own checksum. A read/write will be done on tree block base. A >> tree block also contains an offset points to its previous version (if >> there's one). >> >> The root of offset is stored in trailer, which stays at the end of >> file. Each update creates a new trailer which points to the new tree. >> (details below) >> >> To save the pain of modify a tree block and track the free spaces in >> the index, changed blocks are appended at the end. Based on the >> assumption that user won't change too many files each time, for an >> updated file, all its parent blocks to Root was copied and appended to >> index. In other words, all traversed blocks are copied. The offset of >> previous version of the updated block is stored in the new block. A >> new generation number is stored in copied and updated blocks. Other >> blocks are not copied. They are referenced by offsets. After update a >> new trailer is created at the end. In this way, there's no harm to >> read, and makes write fast. >> >> Trailer stores the offset of previous trailer. It makes tracking old >> versions easy. >> >> Each operation will load and rewrite the header and visited trailer . >> >> Checksum for non identifier purpose will use crc32. Otherwise it uses sha1. >> >> For compatibility, old format of index will be transformed to new >> format in the first operation. >> >> == >> Index format: >> >> - A 8-byte header consisting of >> >> 4-byte signature: >> The signature is { 'T', 'R', 'E', 'E' } >> >> 4-byte version number: >> The current supported versions are 4. >> >> - A number of blocks of variable size >> >> 1-byte block type >> >> 3-byte content size in byte >> >> block content >> >> 4-byte crc32 of all above >> >> - A 20-byte trailer consisting of >> >> 4-byte trailer signature: >> The signature is { 'R', 'O', 'O', 'T' } >> >> 4-byte root block offset >> >> 4-byte extension table offset: >> Zero means no extension >> >> 4-byte offset to previous trailer >> >> 4-byte checksum: >> CRC32 of the header and the trailer (excluding this field) >> >> == >> Tree block: >> >> Tree block content is basically the list of entries in a specified >> path, with all attributes we store in the index now. This entry list >> is sorted by pathname. For doing a bsearch in the list, pathnames are >> stored at the end of block, which makes the size of entry fixed. The >> pathname is pointed from each entry with 2 byte offset (relative to a >> block). This should not be problem as a block is never modified. >> >> It stores the generation number and the offset to old block. It also >> integrates the content of cache-tree. >> >> A tree block content consists of >> >> - 4-byte tree object size >> >> - 20-byte SHA-1 of the cached tree object >> >> - checkpoint : interleave with items in the block, 1 for every 100 items >> 4-byte offset to next checkpoint >> >> - a list attributes corresponding to tree object's item, in the same >> order. These attributes are the same as in DIRC entry format >> except that entry name is removed, and a tree block offset is >> added in case the item is a directory. >> >> 32-bit ctime seconds, the last time a file's metadata changed >> this is stat(2) data >> >> 32-bit ctime nanosecond fractions >> this is stat(2) data >> >> 32-bit mtime seconds, the last time a file's data changed >> this is stat(2) data >> >> 32-bit mtime nanosecond fractions >> this is stat(2) data >> >> 32-bit dev >> this is stat(2) data >> >> 32-bit ino >> this is stat(2) data >> >> 32-bit mode, split into (high to low bits) >> >> 4-bit object type >> valid values in binary are 1000 (regular file), 1010 (symbolic link) >> and 1110 (gitlink) >> >> 3-bit unused >> >> 9-bit unix permission. Only 0755 and 0644 are valid for regular files. >> Symbolic links and gitlinks have value 0 in this field. >> >> 32-bit uid >> this is stat(2) data >> >> 32-bit gid >> this is stat(2) data >> >> 32-bit file size >> This is the on-disk size from stat(2), truncated to 32-bit. >> >> 160-bit SHA-1 for the represented object if blobs or the offset >> to another tree block if trees >> >> A 32-bit 'flags' field split into (high to low bits) >> >> 1-bit assume-valid flag >> >> 1-bit extended flag (must be zero in version 2) >> >> 2-bit stage (during merge) >> >> 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF >> is stored in this field. >> >> 1-bit skip-worktree flag (used by sparse checkout) >> >> 1-bit intent-to-add flag (used by "git add -N") >> >> 14-bit unused, must be zero >> >> A 16-bit offset, relative to the beginning of this block, to the >> pathname of this entry >> >> - 2-byte generation number, starts from 1 >> >> - 4-byte previous version offset >> >> - a list of NUL-terminated pathnames, pointed to from the 16-bit offset >> above >> >> == Extension table block >> >> Extension table has block type 'X'. It consists of a series of 4-byte >> extension block offset. >> >> == Extension block >> >> Extension block has block type 'E'. Extension content is the same as >> in the old format. >> >> >> >> Time line: >> >> 24/04 ~ 21/05: get familiar with code base and revise proposal >> benchmark with linux kernel on major operations >> write prototype to prove feasibility of proposed solution >> consult mailing list & irc >> >> 22/05 ~ 25/06 write code, test and benchmark >> modify tree lib >> modify index format operations >> modify git operations >> transform from old to new >> >> 26/06 ~ 30/07 revise things according to benchmark >> >> 31/07 ~ 13/08 update documentation >> >> >> About me >> >> My name is Elton Tian, I am from China. I have been living in >> Australia for quite a few years. I am currently a Master student from >> Australia National University. After graduate I worked on linux based >> web development (using tcl) for 2.5 years. I have been programming >> with c, c#, java, tcl, php, javascript and shell script. But I prefer >> c, which gives me the feeling of full control over the program. I am >> interested in data intensive computing. I played with hadoop and >> mapreduce since 2010. I maintained my 3 node cluster behind my desk. >> As linus hates cvs with a passion, I still want to mention I was using >> cvs in work, don't like it though. And I guess this is the good chance >> to get over it. >> >> On Thu, Apr 5, 2012 at 2:22 AM, elton sky <eltonsky9404@xxxxxxxxx> wrote: >>> Hello, >>> >>> Some updates for Nguyen's index: >>> >>> On Wed, Apr 4, 2012 at 10:20 PM, Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> wrote: >>>> On Wed, Apr 4, 2012 at 3:26 PM, elton sky <eltonsky9404@xxxxxxxxx> wrote: >>>>> I am not sure how the trailer works. >>>>> I assume there can be multiple trailers, each update will generate a >>>>> new one. Every trailer will point to the root tree (i.e. all trailers >>>>> point to the same block?). So if there are some changes to root, like >>>>> rename, trailers all point to the latest root block? >>>> >>>> Each trailer points to the whole new tree. Because trees are >>>> immutable, changing in a tree meangs creating a new one and will also >>>> make a new parent tree (to point to the updated tree because old >>>> parent will always point to old tree). This eventually leads to root >>>> tree change, recorded by the trailer. >>>> >>>>> Is the index looks like : >>>>> | HEADER | TREE BLOCKS | TRAILER | TREE BLOCKS | TRAILER | TREE >>>>> BLOCKS | TRAILER | ... >>>>> >>> >>> Once an update happened to a block, all parent blocks to root will be >>> copied to the end of the index. And the updated block will be at leaf >>> of the new tree. Finding this path costs logn time anyway. This is no >>> harm for read for the whole tree. But in order to find all previous >>> changes to a block, we have to go through all trailers and trees. >>> >>> Otherwise, just modify the original tree. Let the parent points to the >>> updated block and let the updated block points to the old block: >>> >>> parent >>> | >>> V >>> updated old old >>> block (v3) --> block(v2) --> block (v1) >>> | >>> V >>> child >>> blocks >>> >>> A version number in a tree block is used to track the changes. >>> In this way, there's still no harm to read, and it's more easy to >>> trace the change history of a block. Also, we don't need to create >>> interleaved blocks and trailers. There's only one trailer in the end >>> of file. We also need to add another offset points to previous >>> version. >>> >>> Trailer is kept at the end of index, as its size is variable. It >>> contains offset to root and list of free spaces. >>> >>> Changes to format: >>> >>>> = The git index file has the following format >>> >>> As is, except there's only one trailer now. And trailer contains list >>> of free spaces. >>> >>>>- A 18-byte trailer consisting of >>>> >>>> 4-byte trailer signature: >>>> The signature is { 'R', 'O', 'O', 'T' } >>>> >>>> 2-byte generation: >>>> The first trailer is 0, the second 1 and so on. >>>> >>>> 4-byte root block offset >>>> >>>> 4-byte extension table offset: >>>> Zero means no extension >>>> >>> list of free spaces >>> - 4 byte offset >>> - 2 byte length >>> >>>> 4-byte checksum: >>>> CRC32 of the header and the trailer (excluding this field) >>> >>> Free space list is read/written in whole for each operation, together >>> with trailer. >>> >>>> >>>> == Tree block >>>> ... >>> >>> Above as is. >>> >>> - 1 byte version num >>> >>> - 4 byte offset to previous version block >>> >>>> >>>> 160-bit SHA-1 for the represented object if blobs or the offset >>>> to another tree block if trees >>>> >>>> A 32-bit 'flags' field split into (high to low bits) >>>> >>>> 1-bit assume-valid flag >>>> >>>> 1-bit extended flag (must be zero in version 2) >>>> >>>> 2-bit stage (during merge) >>>> >>>> 12-bit name length if the length is less than 0xFFF; otherwise 0xFFF >>>> is stored in this field. >>>> >>>> 1-bit skip-worktree flag (used by sparse checkout) >>>> >>>> 1-bit intent-to-add flag (used by "git add -N") >>>> >>>> 14-bit unused, must be zero >>>> >>>> A 16-bit offset, relative to the beginning of this block, to the >>>> pathname of this entry. FIXME: make it 32-bit, relative to the >>>> beginning of the file, so that we can reuse pathnames from other >>>> (old) blocks? >>>> >>> >>> -Elton -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html