Re: [POC PATCH 5/5] sha1_file: make the pack machinery thread-safe

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Nguyen Thai Ngoc Duy <pclouds@xxxxxxxxx> writes:

> On Fri, Dec 9, 2011 at 3:39 PM, Thomas Rast <trast@xxxxxxxxxxxxxxx> wrote:
>> More precisely speaking, this pushes the locking down from
>> read_object() into bits of the pack machinery that cannot (yet) run in
>> parallel.
>>
>> There are several hacks here:
>>
>> a) prepare_packed_git() must be called before any parallel accesses
>>   happen.  It now unconditionally opens and maps all index files.
>>
>> b) similarly, prepare_replace_object() must be called before any
>>   parallel read_sha1_file() happens
>>
>> This simplification lets us avoid locking outright to guard the index
>> accesses; locking is then mainly required for open_packed_git(),
>> [un]use_pack(), and such.
>>
>> The ultimate goal would of course be to let at least _some_ pack
>> accesses happen without any locking whatsoever.  But grep already
>> benefits from it with a nice speed boost on non-worktree greps.
>
> (I'm running into multithread pack access problem in rev-list..)
>
> Why not put the global pointer "struct packed_git *packed_git" to
> "struct pack_context" and avoid locking entirely? Resource usage is
> like we run <n> different processes, I think, which is not too bad. We
> may want to share a few static pack_* variables such as
> pack_open_fds.. to avoid hitting system limits too fast.

I was hesitating to do that because I think it's not the best solution
yet.  At least for 64bit systems, I thought of doing some or all of:

* opening/mapping the pack indexes immediately to avoid locking there
  (perhaps the POC already does this, I haven't looked again).  If you
  have many packs this isn't cheap because the index must be verified.

* mapping small packs immediately

* mapping "the" big pack immediately (many repos will have a huge pack
  from the initial clone)

Put another way, my current concern is that on 64bit systems it's
incredibly easy to share (who cares about a few GBs of mmap()?), whereas
on 32bit systems it probably matters much more, but there we also suffer
more from not sharing.

Am I making any sense?

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]