Re: [GSoC] Designing a faster index format

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 3, 2012 at 3:51 PM, Michael Haggerty <mhagger@xxxxxxxxxxxx> wrote:
>> -- Proposed solution --
>> The proposed solution is to redesign the index to a B-tree based format. This
>> allows changes to the index in O(log(n)) time, with n being the number of
>> entries in the index.
>
> I thought that the index lock currently only blocks writers, not readers
> (am I wrong?).

I think you are right.

> So given that you want to be able to mutate the index
> file without rewriting the whole file, it seems to me that you have to
> pick from one of these alternatives:
>
> 1. Change the locking semantics so that readers also block when the file
> is locked.  This choice would have some important consequences: (a)
> readers will also have to obtain a lock before starting to read.  (b) to
> avoid deadlock, it will become crucial that the lock is never held
> across the execution of any other git (sub-)commands that might want to
> read the index.
>
> 2. Implement a file format that can be read even while it is being
> mutated.  If so, please explain the data file format in more detail; in
> particular, how do you plan to mutate the file in a way that does not
> disturb readers?  How do you plan to read the whole index efficiently (I
> imagine that reading the whole index will remain a frequent operation)?

Copy-on-write B-trees, aka btrfs. Does anybody like to go that route ;)

> I encourage you to include an analysis of the number of disk seeks when
> you are analyzing the cost of read/write operations on the index.  This
> will have a strong effect on the time for working with the index when
> the disk cache is cold.  The current index requires O(1) seeks for
> reading and writing, which I believe is a big part of the reason that
> the current read-the-whole-index/write-the-whole-index design performs
> excellently despite the amount of data that it is touching.
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]