Re: disabling sha1dc unaligned access, was Re: One failed self test on Fedora 29

Jeff King <peff@xxxxxxxx> · Tue, 12 Mar 2019 07:05:49 -0400

On Tue, Mar 12, 2019 at 09:53:41AM +0100, Ævar Arnfjörð Bjarmason wrote:

> There's at least a couple of aspects to this.
> 
> One is whether we should have the submodule in
> sha1collisiondetection/. I agree that's probably a bad idea now
> per-se. Honestly I wasn't expecting the answer when I submitted the
> final patch to switch to it fully to be to the effect of submodules
> being too immature for the git project itself to use. So now we're
> effectively mid-series, and should maybe just back out.

I think it's especially funky because we have three different ways of
getting sha1dc (in-tree, submodule, or against an external library). And
I almost blindly submitted a patch making the in-tree version work
(since that's what's used by default, and what I use) which could have
totally broken things for the other use cases without anybody realizing
until the change trickled down to somebody who uses those flags.

(Technically in this case it wouldn't actually have _broken_ them, but
just not helped them, so they'd be no worse off. But hopefully you get
the point).

Speaking of external libraries, in some ways the issue I raised is no
different than it is for any external library, where we're at the mercy
of whatever version is on the system. The big dependency for us is
usually libcurl, and we do have to sometimes work around old versions
there.

But I do think one thing that makes the sha1dc submodule approach
more painful is that we don't control the content of the code, but we
_do_ build it ourselves with our usual compiler flags. So we're weirdly
intimate with it (and in fact, an external library would not have the
problem being discussed here, since it would have been built separately
without UBSan).
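
For anyone following along without the earlier context, here is a rough
sketch of the kind of code pattern at issue when we build with UBSan
ourselves. This is _not_ the actual sha1dc code, and the function names
are made up for illustration. The point is only that casting a byte
pointer to uint32_t* and dereferencing it is undefined when the pointer
is misaligned (which is what the sanitizer's alignment check reports),
while the two variants below are well-defined at any alignment:

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from sha1dc): read a 32-bit word in native
 * byte order from a possibly misaligned address. memcpy() has no
 * alignment requirement, so there is no misaligned load to flag.
 */
static uint32_t load32_any_alignment(const unsigned char *p)
{
	uint32_t v;
	memcpy(&v, p, sizeof(v));
	return v;
}

/*
 * Hypothetical helper (not from sha1dc): read a big-endian 32-bit word
 * one byte at a time. Byte loads are always aligned, and the result
 * does not depend on the host's endianness.
 */
static uint32_t load32_be(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) |
	       ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) |
	       (uint32_t)p[3];
}
```

Compilers typically turn the memcpy() form into a single load on
platforms that allow unaligned access, so the cost of the safe version
is usually small.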

> I fully agree with what you've said in theory, but if we look at what's
> happened in practice we as a project are demonstrably not disciplined
> enough to manage upstream code like this without overtly perma-forking
> it.

I'm not sure I agree completely. Most of the things we've imported are
small enough that we're reasonably happy to accept them as a snapshot in
time and take ownership. I.e., I do not recall a lot of instances of
fixing bugs in compat/regex or compat/poll that we could have gotten
more easily by merging from upstream. But I admit I don't actually pay
much attention to those areas, so I might be completely off-base.

The one place I really _would_ have liked to remain compatible with
upstream is xdiff. And we were traditionally pretty hesitant to clean
things up there for fear of diverging. But in practice, upstream there
has been stagnant, and we've done most of the bug fixes and improvements
to it (in-tree).

> As far as I can tell none of the people changing that code went through
> the process of submitting a parallel upstream fix or seeing if the issue
> was fixed upstream and we could just update the code we were carrying,
> and of course that gets progressively harder for any one contributor as
> our divergence grows.

To be clear, I do sympathize with the notion that not pulling things
in-tree keeps our relationship with upstream more disciplined, and that
has value. I'm just not altogether clear how much it's really hurt us
overall to be undisciplined.

-Peff