Re: [PATCH] mktree: learn about promised objects

On 15/06/2022 05:00, Jeff King wrote:
> On Tue, Jun 14, 2022 at 08:35:16PM -0400, Taylor Blau wrote:
> 
>> On Tue, Jun 14, 2022 at 01:27:18PM -0400, Derrick Stolee wrote:
>>>> Did you have any other sort of performance test in mind? The remotes we
>>>> typically deal with are geographically far away and deal with a high volume
>>>> of traffic so we're keen to move behaviour to the client where it makes sense
>>>> to do so.
>>>
>>> I guess I wonder how large your promisor pack-files are in this test,
>>> since your implementation depends on for_each_packed_object(), which
>>> should be really inefficient if you're actually dealing with a large
>>> partial clone.
>>
>> I had the same thought. Storing data available in the promisor packs
>> into an oid_map is going to be expensive if there are many such objects.
>>
>> Is there a reason that we can't introduce a variant of
>> find_kept_pack_entry() that deals only with .promisor packs and look
>> these things up as-needed?
> 
> It's much worse than that. The promisor mechanism is fundamentally very
> inefficient in runtime, optimizing instead for size. Imagine I have a
> partial clone and I retrieve tree X, which points to a blob Y that I
> don't get. I have X in a promisor pack, and asking about it is
> efficient. But if I want to know about Y, I have no data structure
> mentioning Y except the tree X itself. So to enumerate all of the
> promisor edges, I have to walk all of the trees in the promisor pack.
> 
> So it is not just lookup, but actual tree walking that is expensive. The
> flip side is that you don't have to store a complete separate list of
> the promised objects. Whether that's a win depends on how many local
> objects you have, versus how many are promised.
> 
> But it would be possible to cache the promisor list to make the tradeoff
> separately. E.g., do the walk over the promisor trees once (perhaps at
> pack creation time), and store a sorted list of fixed-length (oid, type)
> records that could be binary searched. You could even put it in the
> .promisor file. :)
> 
> -Peff

I like the idea of caching the promisor list at pack creation time;
I'll start work on a patch set that implements this.
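
As a rough sketch of the lookup side (not the patch itself), the sorted
list of fixed-length (oid, type) records suggested above could look
something like the following. Every name here (SKETCH_OID_RAWSZ,
struct promised_record, promised_list_sort, promised_list_lookup) is
hypothetical and is not an existing Git API. The 20-byte oid size and
the one-byte type field are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical constant for this sketch: raw size of a SHA-1 sized oid. */
#define SKETCH_OID_RAWSZ 20

/* One fixed-length record: raw object id followed by a one-byte type code. */
struct promised_record {
	unsigned char oid[SKETCH_OID_RAWSZ];
	unsigned char type; /* illustrative type code, assumed to fit in a byte */
};

/* Binary search over an on-disk table only works if records have no padding. */
static_assert(sizeof(struct promised_record) == SKETCH_OID_RAWSZ + 1,
	      "promised records must be fixed-length");

/* Order records by raw oid bytes only; the type is payload, not part of the key. */
int promised_record_cmp(const void *a, const void *b)
{
	const struct promised_record *ra = a;
	const struct promised_record *rb = b;

	return memcmp(ra->oid, rb->oid, SKETCH_OID_RAWSZ);
}

/* Sort once, e.g. after walking the promisor trees at pack creation time. */
void promised_list_sort(struct promised_record *list, size_t nr)
{
	if (nr)
		qsort(list, nr, sizeof(*list), promised_record_cmp);
}

/* Return the matching record, or NULL if the oid is not in the list. */
const struct promised_record *
promised_list_lookup(const struct promised_record *list, size_t nr,
		     const unsigned char *oid)
{
	struct promised_record key;

	if (!nr)
		return NULL;
	memcpy(key.oid, oid, SKETCH_OID_RAWSZ);
	key.type = 0; /* ignored by the comparator */
	return bsearch(&key, list, nr, sizeof(*list), promised_record_cmp);
}
```

With records laid out like this, answering "is Y promised, and what type
is it?" costs O(log n) comparisons instead of a walk over every tree in
the promisor pack, at the price of storing one record per promised
object.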

Meanwhile, is it worth considering a '--promised-as-missing' option
(or a config option) for invocations such as 'mktree --missing' that
prevents promised objects from being faulted in? Currently, the only
reliable way that I've found to stop 'mktree --missing' from faulting
in promised objects is to remove the remote. Such an option could
either set the global variable 'fetch_if_missing' to '0' or ensure that
'OBJECT_INFO_SKIP_FETCH_OBJECT' is passed appropriately.

Cheers,
Richard


