Re: [RFC PATCH v2 1/5] range-diff: zero out elements in "cost" first

Jeff King <peff@xxxxxxxx> · Tue, 14 Dec 2021 08:36:29 -0500

On Fri, Dec 10, 2021 at 01:30:38PM +0100, Ævar Arnfjörð Bjarmason wrote:

> Have the get_correspondences() function use CALLOC_ARRAY instead of
> looping over the newly allocated "cost" to zero out some of its
> elements.
> 
> The for-loop that zero'd out elements in "cost" wasn't the first loop
> in that function, nor did it cover all of its elements, but regardless
> of that this change doesn't affect its end-state. None of the
> for-loops touched the same items in the array, so we could have (and
> an earlier WIP version of this change did) moved the for-loop we're
> deleting to come first in get_correspondences().
> 
> This can be seen from a careful reading of this code added in in
> d9c66f0b5bf (range-diff: first rudimentary implementation,
> 2018-08-13) (as well as adding some printf-debugging) we'll set all
> elements in the in the "n * n" allocated array. That's "n = A+B" where
> "A" and "B" are the number of commits in our respective ranges.

It took me several readings to understand this argument. I think it's
just:

  When setting up the 2-dimensional "cost" array, we first fill in all
  rows 0..a->nr, and then all columns 0..b->nr. After that, we zero the
  rest of the array: slots which are both in rows greater than a->nr
  _and_ columns greater than b->nr.

  We can instead just zero everything from the start with
  CALLOC_ARRAY() and skip the final zero-ing.

I'm not necessarily asking for a re-roll with the message, but just
re-stating it to make sure I understand what's going on.

> So let's just allocate this with CALLOC_ARRAY(), and skip these two
> for-loops.

The first part tells us it's not wrong to do this change. But it doesn't
give us a "why". Is it just simpler code? We think it might be faster to
calloc than hand-rolling?

The "two for-loops" confused me for a minute. Really there are two
separate cases:

  - we can calloc the "cost" array and drop the final zero-ing loop in
    get_correspondences(), per the argument above

  - we can calloc the a2b and b2a arrays, which saves
    compute_assignment() from doing it itself later. Though this is a
    little weird, since it may zero them or it may set all entries to
    -1.

Given the large explanation needed for the first one (above), and then
this totally distinct explanation for the second one:

> Furthermore let's remove the early exit condition from
> compute_assignment() in favor of an assert that it must be called with
> "column_count > 1", then "get_correspondences()" can skip calling it
> when no further filling of the "a2b" and "b2a" arrays is needed.

...it kind of feels like they ought to be separate patches. At the very
least, it would be nice if the two cases were laid out more clearly at
the start of the commit message.

I do find the movement of this column_count check strange. It feels like
it's breaking the encapsulation of compute_assignment(); now its caller
knows more about when it might zero things and when it might do an
actual computation. And it makes things more complicated (we have a
conditional in the sole caller _and_ we have an assert to make sure the
caller checked that conditional).

But maybe this is really important for some later refactor. I'll keep
reading...

-Peff