Re: [RFC PATCH 02/10] range-diff.c: don't use st_mult() for signed "int"

On Fri, Dec 10, 2021 at 01:31:10PM +0100, Ævar Arnfjörð Bjarmason wrote:

> I don't think using ssize_t like that is portable, and that we'd need
> something like intmax_t if we needed this in another context.
> 
> Firstly, it's not standard C, it's just in POSIX; intmax_t is standard
> C as of C99, and we have in-tree code that already depends on it (and
> uintmax_t).
> 
> But more importantly it's not "as big as size_t, just signed" in
> POSIX. size_t is "no greater than the width of type long"[1] and
> LONG_MAX is at least 2^31-1 [2].

Thanks, I didn't know that about ssize_t. I do wonder how often it
actually ends up _not_ being the same magnitude as size_t. Certainly I
can see how write() could decide to just work in SSIZE_MAX chunks, since
the caller has to be prepared to loop anyway. But it seems like the
obvious implementation is for it to be a signed size_t; I'd be curious
to hear of any platforms that diverge from this (i.e., is this a real
portability concern, or like NULL pointers that aren't all-zeroes, one
that we don't care about in practice).
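
Just to make that assumption concrete, here's the kind of compile-time
check I have in mind (purely illustrative, not something I'm proposing
for git.git, and it assumes a C11 compiler for _Static_assert). On a
platform where ssize_t really is just a signed size_t, both assertions
hold; on one where it diverges, at least one of them trips:

  #include <limits.h>      /* SSIZE_MAX (POSIX) */
  #include <stddef.h>      /* size_t */
  #include <stdint.h>      /* SIZE_MAX */
  #include <sys/types.h>   /* ssize_t */

  /* same width as size_t... */
  _Static_assert(sizeof(ssize_t) == sizeof(size_t),
                 "ssize_t is narrower than size_t");

  /* ...and the positive range a signed size_t would give us */
  _Static_assert(SSIZE_MAX == SIZE_MAX / 2,
                 "SSIZE_MAX is smaller than a signed size_t's maximum");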

I do suspect we've already made that assumption elsewhere, though it's
hard to check easily. Grepping for ssize_t turns up lots of reasonable
and legitimate uses, though some, like the one in strbuf_realpath(),
are questionable (it's assigning from an int!).
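
As an aside, the sort of intmax_t-based check you mention above could
look roughly like this. The name and shape are hypothetical (it's not
an existing git helper); it's just the usual textbook signed-overflow
check applied to intmax_t:

  #include <stdint.h>
  #include <stdio.h>
  #include <stdlib.h>

  /* multiply two signed values, dying on overflow (sketch only) */
  static intmax_t imax_mult_or_die(intmax_t a, intmax_t b)
  {
          if (a > 0) {
                  if (b > 0 && a > INTMAX_MAX / b)
                          goto overflow;
                  if (b < 0 && b < INTMAX_MIN / a)
                          goto overflow;
          } else if (a < 0) {
                  if (b > 0 && a < INTMAX_MIN / b)
                          goto overflow;
                  if (b < 0 && b < INTMAX_MAX / a)
                          goto overflow;
          }
          return a * b;
  overflow:
          fprintf(stderr, "fatal: %jd * %jd would overflow\n", a, b);
          exit(128);
  }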

> 3. B.t.w. a thing I ended up ejecting out of this was that I made a
>    "test_commit_bulkier" which is N times faster than "test_commit_bulk":
>    it just makes the same commit N times with the printf-repeating feature
>    and feeds it to fast-import, but the test took so long in any case that
>    I couldn't find a plausible way to get it in-tree.

Yes, I noticed it was rather slow. The main culprit is Git writing out
new blobs and trees for each commit, which is what I assume your
"bulkier" version skipped (the existing "bulk" one is careful not to
use any sub-processes). You can instruct test_commit_bulk to use
identical content in each commit, which saves a lot of time; that is
what the "quick" cases in the script below do, via --filename and
--contents.

It's also highly non-linear in the number of commits when the tree
changes. That suggests that fast-import's tree-handling could be
improved. Here are the results of the hacky perf script below, showing
both the non-linearity in the "full" case and how much faster the
"quick" (commits-only) case is:

  Test                 this tree        
  --------------------------------------
  1234.2: full 1000    0.35(0.27+0.08)  
  1234.3: full 2000    0.85(0.81+0.04)  
  1234.4: full 4000    3.21(3.09+0.11)  
  1234.5: full 8000    12.13(11.85+0.27)
  1234.6: quick 1000   0.14(0.12+0.02)  
  1234.7: quick 2000   0.20(0.18+0.03)  
  1234.8: quick 4000   0.31(0.28+0.04)  
  1234.9: quick 8000   0.58(0.55+0.03)  

-- >8 --
#!/bin/sh

test_description='foo'
. ./perf-lib.sh

test_expect_success 'empty repo' 'git init'

test_perf 'full 1000' 'test_commit_bulk --id=full 1000'
test_perf 'full 2000' 'test_commit_bulk --id=full 2000'
test_perf 'full 4000' 'test_commit_bulk --id=full 4000'
test_perf 'full 8000' 'test_commit_bulk --id=full 8000'

test_perf 'quick 1000' 'test_commit_bulk --id=quick --filename=foo --contents=bar 1000'
test_perf 'quick 2000' 'test_commit_bulk --id=quick --filename=foo --contents=bar 2000'
test_perf 'quick 4000' 'test_commit_bulk --id=quick --filename=foo --contents=bar 4000'
test_perf 'quick 8000' 'test_commit_bulk --id=quick --filename=foo --contents=bar 8000'

test_done
-- >8 --


