Re: [PATCH 00/32] Split index mode for very large indexes

Duy Nguyen <pclouds@xxxxxxxxx> writes:

> On Mon, Apr 28, 2014 at 02:18:44PM -0700, Shawn Pearce wrote:
>> > The read penalty is not addressed here, so I still pay 14MB hashing
>> > cost. But that's an easy problem. We could cache the validated index
>> > in a daemon. Whenever git needs to load an index, it pokes the daemon.
>> > The daemon verifies that the on-disk index still has the same
>> > signature, then sends the in-mem index to git. When git updates the
>> > index, it pokes the daemon again to update in-mem index. Next time git
>> > reads the index, it does not have to pay I/O cost any more (actually
>> > it does but the cost is hidden away when you do not have to read it
>> > yet).
>> 
>> If we are going this far, maybe it is worthwhile building a mmap()
>> region the daemon exports to the git client that holds the "in memory"
>> format of the index. Clients would mmap this PROT_READ, MAP_PRIVATE
>> and can then quickly access the base file information without doing
>> further validation, or copying the large(ish) data over a pipe.
>
> The below patch implements such a daemon to cache the index. It takes
> 91ms and 377ms to load a 25MB index with and without the daemon,
> respectively. I use shared memory instead of a pipe, but the format is
> still "on disk", not "in memory", for simplicity. I think we're good
> even without the in-memory format.

Interesting ;-).
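
For readers following along, here is a minimal sketch of the "check the
signature, then reuse the validated copy" idea discussed above. It is
not the patch from this thread. It assumes only that an index file ends
with a 20-byte SHA-1 of the bytes before it; the names IndexCache,
load() and write_fake_index() are hypothetical, and the cache lives in
the same process rather than in a daemon with shared memory.

```python
import hashlib
import os
import tempfile

SIG_LEN = 20  # assumed: the file ends with a 20-byte SHA-1 of the preceding bytes


class IndexCache:
    """Hypothetical in-process stand-in for the caching daemon."""

    def __init__(self):
        self.signature = None  # trailing SHA-1 of the cached index
        self.data = None       # validated index bytes, still in "on disk" format
        self.full_loads = 0    # how many times the full hashing cost was paid

    def _read_signature(self, path):
        # Read only the trailing checksum, not the whole file.
        with open(path, "rb") as f:
            f.seek(-SIG_LEN, os.SEEK_END)
            return f.read(SIG_LEN)

    def load(self, path):
        sig = self._read_signature(path)
        if self.data is not None and sig == self.signature:
            return self.data  # cache hit: no full read, no rehash
        with open(path, "rb") as f:
            data = f.read()
        # Cache miss: validate the whole file before trusting it.
        if hashlib.sha1(data[:-SIG_LEN]).digest() != data[-SIG_LEN:]:
            raise ValueError("index checksum mismatch")
        self.signature, self.data = data[-SIG_LEN:], data
        self.full_loads += 1
        return data


def write_fake_index(path, payload):
    # Not a real git index: just a payload followed by its SHA-1.
    with open(path, "wb") as f:
        f.write(payload + hashlib.sha1(payload).digest())


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "index")
    write_fake_index(path, b"DIRC" + b"x" * 100)
    cache = IndexCache()
    cache.load(path)  # pays the hashing cost
    cache.load(path)  # served from the cache
    print("full loads:", cache.full_loads)
```

The second load() only reads the last 20 bytes of the file, which is
where the saving would come from in this sketch; a real daemon would
additionally hand the cached bytes to the client through shared memory.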