Re: GSoC - Designing a faster index format

On Mon, Mar 26, 2012 at 9:28 PM, Thomas Rast <trast@xxxxxxxxxxxxxxx> wrote:
> elton sky <eltonsky9404@xxxxxxxxx> writes:
>
>> On Mon, Mar 26, 2012 at 12:06 PM, Nguyen Thai Ngoc Duy
>> <pclouds@xxxxxxxxx> wrote:
>>> (I think this should be on git@vger as there are many experienced devs there)
>>>
>>> On Sun, Mar 25, 2012 at 11:13 AM, elton sky <eltonsky9404@xxxxxxxxx> wrote:
>>>> About the new format:
>>>>
>>>> The index is a single file. Entries in the index are still stored
>>>> sequentially, as in the old format. The difference is that they are
>>>> grouped into blocks. A block contains many entries, ordered by name.
>>>> Blocks are also ordered by the name of their first entry. Each block
>>>> contains a sha1 for the entries in it.
>>>
>>> If I remove an entry in the first block, because blocks are of fixed
>>> size, you would need to shift all entries up by one, thus update all
>>> blocks?
>>
>> We need some GC here. I am not moving all blocks. Rather, I would
>> consider merging or recycling the block. In the simple case, if a block
>> becomes empty, I'll change the "offset of new block" field in the
>> header to point to this block, and make this block point to the
>> original offset of the new block. This way, I keep a list of empty
>> blocks I can reuse.
> [...]
>
> Doesn't that venture into database land?
>
> If we go that far, wouldn't it be better to use a proper database
> library?  All other things being equal, writing such complex code from
> scratch is probably not a good idea.

Yes, if there's a library that fits our needs (including static
linking). I think we've come close to the sqlite file format [1]. But
sqlite comes with an SQL engine, transactional updates... that we don't
need. Another obvious source of inspiration is file systems, but I
dare not go that way.
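
For reference, the block-recycling scheme quoted above can be sketched
roughly as follows. This is only an illustration under assumptions that
are not in the proposal: the names (BlockFile, alloc, free), the 64-byte
block size, the 8-byte header and the in-memory bytearray standing in
for the index file are all hypothetical. The header keeps one "offset
of new block" field; freeing a block stores the old value of that field
in the block and points the header at the block, so empty blocks form a
linked list that ends at the end of the file.

```python
import struct

BLOCK_SIZE = 64                # hypothetical fixed block size, in bytes
HEADER_SIZE = 8                # hypothetical header: one 64-bit offset field
OFFSET = struct.Struct("<Q")   # little-endian unsigned 64-bit offset


class BlockFile:
    """In-memory stand-in for the index file (illustration only)."""

    def __init__(self):
        self.data = bytearray(HEADER_SIZE)
        # No blocks yet, so the next new block goes at the end of the file.
        self._set_next_new(HEADER_SIZE)

    def _next_new(self):
        return OFFSET.unpack_from(self.data, 0)[0]

    def _set_next_new(self, offset):
        OFFSET.pack_into(self.data, 0, offset)

    def alloc(self):
        """Return the offset of a block to write to, reusing a freed one if any."""
        offset = self._next_new()
        if offset == len(self.data):
            # Free list is empty: grow the file by one block.
            self.data.extend(bytes(BLOCK_SIZE))
            self._set_next_new(len(self.data))
        else:
            # Reuse a freed block; its first bytes hold the previous
            # "offset of new block" value, which becomes current again.
            self._set_next_new(OFFSET.unpack_from(self.data, offset)[0])
        return offset

    def free(self, offset):
        """Push an empty block onto the list of reusable blocks."""
        OFFSET.pack_into(self.data, offset, self._next_new())
        self._set_next_new(offset)


f = BlockFile()
a = f.alloc()
b = f.alloc()
f.free(a)        # block a becomes the head of the free list
c = f.alloc()    # hands back the freed block instead of growing the file
```

Note that this only covers reusing blocks that are completely empty.
Merging half-empty blocks and keeping blocks ordered by name would need
more bookkeeping, which is where it starts to look like a database.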

[1] http://www.sqlite.org/fileformat2.html
-- 
Duy