Re: [PATCH] revision: introduce prepare_revision_walk_extended()

René Scharfe <l.s.r@xxxxxx> · Mon, 18 Dec 2017 20:18:19 +0100

Am 18.12.2017 um 16:10 schrieb Jeff King:
> On Sat, Dec 16, 2017 at 01:12:16PM +0100, René Scharfe wrote:
> 
>> prepare_revision_walk() allows callers to take ownership of the array of
>> pending objects by setting the rev_info flag "leak_pending" and copying
>> the object_array "pending".  They use it to clear commit marks after
>> setup is done.  This interface is brittle enough that it requires
>> extensive comments.
>>
>> Provide an easier way by adding a function that can hand over the array
>> to a caller-supplied output parameter and converting all users of the
>> flag "leak_pending" to call prepare_revision_walk_extended() instead.
> 
> I think this is _better_, but it's still kind of a funny interface.
> 
> The root of the matter is that the revision-walking code doesn't clean
> up after itself. In every case, the caller is just saving these to clean
> up commit marks, isn't it?

bundle also checks if the pending objects exists.

> Could we instead have an interface like:
> 
>    revs.clear_commit_marks = 1;
>    prepare_revision_walk(&revs);
>    ...
>    finish_revision_walk(&revs);
> 
> where that final function would do any cleanup, including clearing the
> commit marks. I suspect there are other small bits that get leaked
> because there's not really any "destructor" for a revision walk.
> 
> It's not as flexible as this whole "make a copy of the pending tips"
> thing, but it keeps all of the details abstracted away from the callers.
> 
> Alternatively:
> 
>> +`prepare_revision_walk_extended`::
>> +
>> +	Like prepare_revision_walk(), but allows callers to take ownership
>> +	of the array of pending objects by passing an object_array pointer
>> +	as the second parameter; passing NULL clears the array.
> 
> What if we just got rid of this function and had callers do:
> 
>    object_array_copy(&old_pending, &revs);
>    prepare_revision_walk(&revs);
>    ...
>    clear_commit_marks_for_object_array(&old_pending);
> 
> That sidesteps all of the memory ownership issues by just creating a
> copy. That's less efficient, but I'd be surprised if it matters in
> practice (we tend to do one or two revisions per process, there don't
> tend to be a lot of pending tips, and we're really just talking about
> copying some pointers here).

This was done before I added the leak_pending flag.

t5502 and t5571 have test cases with ca. 1000 pending objects, t5551
and t5541 with ca. 2000, p5310 more than 8000.  That's just a few KB.

I don't know if there can be real-world use cases with millions of
entries (when it would start to hurt).

Why does prepare_revision_walk() clear the list of pending objects at
all?  Assuming the list is append-only then perhaps remembering the
last handled index would suffice.

René