Re: [PATCH] send-pack: use OBJECT_INFO_QUICK to check negative objects

On Tue, Dec 03, 2019 at 07:55:22PM -0800, Jonathan Nieder wrote:

> > We can fix this by using OBJECT_INFO_QUICK, which tells the lookup
> > code that we expect objects to be missing. Notably, it will not re-scan
> > the packs, and it will use the loose cache from 61c7711cfe (sha1-file:
> > use loose object cache for quick existence check, 2018-11-12).
> 
> On first reading, I wondered how this would interact with alternates,
> since you had mentioned that checking alternates can be expensive.  Does
> this go too far in that direction by treating an object as missing
> whenever it's not in the local object store, even if it's available from
> an alternate?
> 
> But I believe that was a misreading.  With this patch, we still do pay
> the cost of checking alternates for the missing object.  The savings
> is instead about having to *double* check.
> 
> Am I understanding correctly?

Yes, we'd still look in alternates for each object before giving up. The
reason alternates are relevant is that normally if you have (say) 5
alternates, then every object lookup costs 5 syscalls, one per alternate,
to find out whether any of them has it. And alternates are more likely to
be on high-latency filesystems like NFS, which exacerbates the cost. But
with OBJECT_INFO_QUICK, we'll build an in-memory cache for each alternate
directory (as well as the main object store, of course), rather than
making one request per object.
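
To put rough numbers on that, here is a toy model of the two lookup
strategies. It is only a sketch and not Git's code: the Store class, its
request counter, and the object names are all made up for illustration,
and each store is cached in one go, whereas the real loose cache is (as
far as I recall) filled lazily per two-hex-digit subdirectory.

```python
class Store:
    """Hypothetical stand-in for one object directory (main store or alternate)."""

    def __init__(self, loose_objects):
        self._on_disk = set(loose_objects)  # what the filesystem holds
        self.requests = 0                   # simulated filesystem round trips
        self._cache = None                  # filled lazily by the quick path

    def has_slow(self, oid):
        # One filesystem request per lookup, whether it hits or misses.
        self.requests += 1
        return oid in self._on_disk

    def has_quick(self, oid):
        # One directory listing on first use, then memory-only lookups.
        if self._cache is None:
            self.requests += 1
            self._cache = frozenset(self._on_disk)
        return oid in self._cache


def count_requests(stores, oids, quick):
    """Look up each oid across the stores; return (found, total requests)."""
    found = 0
    for oid in oids:
        for store in stores:
            hit = store.has_quick(oid) if quick else store.has_slow(oid)
            if hit:
                found += 1
                break
    return found, sum(s.requests for s in stores)


def make_stores():
    # Main object store with one object, plus 5 empty alternates.
    return [Store({"aa01"})] + [Store(set()) for _ in range(5)]


# 100 objects we expect to be missing everywhere.
missing = ["bb%02d" % i for i in range(100)]
slow = count_requests(make_stores(), missing, quick=False)
quick = count_requests(make_stores(), missing, quick=True)
print(slow, quick)  # (0, 600) (0, 6)
```

In this model the slow path pays one request per object per store, while
the quick path pays one listing per store no matter how many missing
objects we ask about. The trade-off is that the cached view can go stale,
which is acceptable when we expect the objects to be missing anyway.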

> > Interestingly, upload-pack does not use OBJECT_INFO_QUICK when it's
> > getting oids from the other side. But I think it could possibly benefit
> > in the same way. Nobody seems to have noticed. Perhaps it simply comes
> > up less, as servers would tend to have more objects than their clients?
> 
> I like to imagine that servers are also more likely to keep a tidy set
> of packs and to avoid alternates.  But using INFO_QUICK when checking
> the fetcher's "have"s does sound like a sensible change to me.

At GitHub we do use alternates (but only one, and on the same local
disk). And our packing situation does sometimes get unwieldy. I think it
might be worth looking into, but it would be nice to have real numbers
before proceeding (likewise we've known about this spot in send-pack,
but it hadn't been expensive enough for anybody to notice; I'll be
curious to see real-world numbers from Patrick's case).

-Peff


