On Mon, Mar 26, 2012 at 9:28 PM, Thomas Rast <trast@xxxxxxxxxxxxxxx> wrote:
> elton sky <eltonsky9404@xxxxxxxxx> writes:
>
>> On Mon, Mar 26, 2012 at 12:06 PM, Nguyen Thai Ngoc Duy
>> <pclouds@xxxxxxxxx> wrote:
>>> (I think this should be on git@vger as there are many experienced devs there)
>>>
>>> On Sun, Mar 25, 2012 at 11:13 AM, elton sky <eltonsky9404@xxxxxxxxx> wrote:
>>>> About the new format:
>>>>
>>>> The index is a single file. Entries in the index still stored
>>>> sequentially as old format. The difference is they are grouped into
>>>> blocks. A block contains many entries and they are ordered by names.
>>>> Blocks are also ordered by the name of the first entry. Each block
>>>> contains a sha1 for entries in it.
>>>
>>> If I remove an entry in the first block, because blocks are of fixed
>>> size, you would need to shift all entries up by one, thus update all
>>> blocks?
>>
>> We need some GC here. I am not moving all blocks. Rather I would
>> consider merge or recycle the block. In a simple case if a block
>> becomes empty, I'll change the offset of new block in the header point
>> to this block, and make this block points to the original offset of
>> new block. In this way, I keep the list of empty blocks I can reuse.
> [...]
>
> Doesn't that venture into database land?
>
> If we go that far, wouldn't it be better to use a proper database
> library? All other things being equal, writing such complex code from
> scratch is probably not a good idea.

If there's a library that fits our needs (including linking
statically). I think we've come close to sqlite file format [1]. But
sqlite comes with sql engine, transactional updates... that we don't
need. Another obvious source for inspiration is file systems, but I
dare not go that way.
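
To make the empty-block recycling quoted above concrete, here is a rough
sketch of it as a free list. It is only an illustration under assumptions,
not the proposed index format: blocks live in an in-memory list instead of
at file offsets, a sentinel stands in for "append at end of file", and the
names BlockFile, Block, free_head and next_free are all made up for this
example.

```python
from dataclasses import dataclass, field
from typing import List

NO_BLOCK = -1  # hypothetical sentinel: no reusable block, append instead


@dataclass
class Block:
    entries: List[str] = field(default_factory=list)  # entry names in this block
    next_free: int = NO_BLOCK  # only meaningful while the block is on the free list


@dataclass
class BlockFile:
    blocks: List[Block] = field(default_factory=list)
    free_head: int = NO_BLOCK  # header field: where the next new block should go

    def allocate(self) -> int:
        """Return the index of a block to write into, reusing an empty one first."""
        if self.free_head != NO_BLOCK:
            idx = self.free_head
            # The header now points at whatever the recycled block pointed to.
            self.free_head = self.blocks[idx].next_free
            self.blocks[idx].next_free = NO_BLOCK
            return idx
        # No empty block available: grow the file by one block.
        self.blocks.append(Block())
        return len(self.blocks) - 1

    def remove_entry(self, idx: int, name: str) -> None:
        """Remove an entry; if the block becomes empty, push it on the free list."""
        block = self.blocks[idx]
        block.entries.remove(name)
        if not block.entries:
            # The empty block remembers the old "new block" position, and the
            # header is redirected to the empty block. No other block moves.
            block.next_free = self.free_head
            self.free_head = idx
```

Removing the last entry of a block touches only that block and the header,
which is the point of the scheme: nothing has to be shifted. Merging
half-empty blocks, which is also mentioned above, is not covered here.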
[1] http://www.sqlite.org/fileformat2.html
--
Duy