Re: [PATCH v2 7/8] diff: add ability to insert additional headers for paths

Johannes Altmanninger <aclopte@xxxxxxxxx> · Fri, 31 Dec 2021 04:07:46 +0100

On Thu, Dec 30, 2021 at 02:04:30PM -0800, Elijah Newren wrote:
> >                                 enqueue(file_pair_for(extra_headers[j]))
> 
> The queue is an array of sorted items, so enqueue here would be
> insertion into an already sorted list.  Inserting N items into a list
> of M items is quadratic (O(N*M)) -- unless you meant to just append to
> the end and add a third sort at the end?

yeah I would have probably used a third sort

> 
> >                         j++
> 
> At the end of the for loop, there may be remaining additional headers
> that sort after all those found in the queue, so you'll need an
> additional loop to handle those.

my bad, I should have tried it

> It's actually considerably more code as you can see from the diffstat,
> and feels like we're reaching into some ugly internals with tmp_queue
> (the SWAP and the special-case freeing) in order to get the desired
> performance improvements.  And it was already O(NlogN) overall (due to
> the sort), which doesn't change with this new algorithm.  It's really,
> really hard for me to imagine a case where we have large numbers of
> additional headers.  Even if someone else can imagine that we for some
> reason have a huge number of conflicts in order to generate a huge
> number of additional headers...how could the performance of sorting
> O(N) filenames and merging these lists possibly matter in comparison
> to the O(N) three-way file merges that would likely have been
> performed from those conflicts?

Yeah, I agree with that conclusion, it's surely not worth the added complexity.
Seeing the code definitely helps, thanks.

> 
> So, I'm going to throw this code away and keep the original.
> 
> It was an interesting idea and exercise; thanks for keeping me on my toes.