On Tue, Dec 03, 2019 at 07:55:22PM -0800, Jonathan Nieder wrote:

> > We can fix this by using OBJECT_INFO_QUICK, which tells the lookup
> > code that we expect objects to be missing. Notably, it will not re-scan
> > the packs, and it will use the loose cache from 61c7711cfe (sha1-file:
> > use loose object cache for quick existence check, 2018-11-12).
>
> On first reading, I wondered how this would interact with alternates,
> since you had mentioned that checking alternates can be expensive. Does
> this go too far in that direction by treating an object as missing
> whenever it's not in the local object store, even if it's available from
> an alternate?
>
> But I believe that was a misreading. With this patch, we still do pay
> the cost of checking alternates for the missing object. The savings
> is instead about having to *double* check.
>
> Am I understanding correctly?

Yes, we'd still look in alternates for each object before giving up.

The reason alternates are relevant is that normally if you have (say) 5
alternates, then you have to do 5 syscalls to find out whether each
alternate has an object. And alternates are more likely to be on
high-latency filesystems like NFS, which exacerbates the cost.

But with OBJECT_INFO_QUICK, we'll build an in-memory cache for each
alternate directory (as well as the main object store, of course),
rather than making one request per object.

> > Interestingly, upload-pack does not use OBJECT_INFO_QUICK when it's
> > getting oids from the other side. But I think it could possibly benefit
> > in the same way. Nobody seems to have noticed. Perhaps it simply comes
> > up less, as servers would tend to have more objects than their clients?
>
> I like to imagine that servers are also more likely to keep a tidy set
> of packs and to avoid alternates. But using INFO_QUICK when checking
> the fetcher's "have"s does sound like a sensible change to me.

At GitHub we do use alternates (but only one, and on the same local
disk). And our packing situation does sometimes get unwieldy. I think it
might be worth looking into, but it would be nice to have real numbers
before proceeding (likewise we've known about this spot in send-pack,
but it hadn't been expensive enough for anybody to notice; I'll be
curious to see real-world numbers from Patrick's case).

-Peff
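
To make the request-count argument above concrete, here is a toy model
in Python. It is only a sketch and not git's code: ObjectDir, has_slow,
has_quick, count_requests and demo are hypothetical names, "one request"
stands in for a single existence check or a single directory listing,
and filling the cache lazily per two-hex-digit fan-out prefix is my
assumption about how the loose cache behaves.

```python
import hashlib


class ObjectDir:
    """Toy stand-in for one object directory (main store or an alternate)."""

    def __init__(self, oids):
        self._on_disk = set(oids)
        self._cache = {}      # fan-out prefix -> frozenset of oids (assumed layout)
        self.fs_requests = 0  # simulated filesystem round trips

    def has_slow(self, oid):
        # Without the cache: one filesystem request per object looked up.
        self.fs_requests += 1
        return oid in self._on_disk

    def has_quick(self, oid):
        # With the cache: list a fan-out directory once, then answer from memory.
        prefix = oid[:2]
        if prefix not in self._cache:
            self.fs_requests += 1
            self._cache[prefix] = frozenset(
                o for o in self._on_disk if o[:2] == prefix
            )
        return oid in self._cache[prefix]


def count_requests(dirs, oids, quick):
    """Look up every oid in every directory until found; return total requests."""
    for oid in oids:
        for d in dirs:
            found = d.has_quick(oid) if quick else d.has_slow(oid)
            if found:
                break
    return sum(d.fs_requests for d in dirs)


def demo(n_alternates=5, n_missing=1000):
    """Compare request counts when every looked-up object is missing."""
    missing = [hashlib.sha1(str(i).encode()).hexdigest() for i in range(n_missing)]
    # Main object store plus the alternates, all empty, so every lookup misses.
    slow_dirs = [ObjectDir([]) for _ in range(n_alternates + 1)]
    quick_dirs = [ObjectDir([]) for _ in range(n_alternates + 1)]
    slow = count_requests(slow_dirs, missing, quick=False)
    quick = count_requests(quick_dirs, missing, quick=True)
    return slow, quick
```

In this model the uncached path costs one request per missing object per
directory, while the cached path is bounded by 256 listings per
directory no matter how many objects are checked. The numbers say
nothing about real latency, which is why measurements from an actual
repository would still be needed.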