Re: [RFC PATCH 00/10] range-diff: fix segfault due to integer overflow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21/12/2021 23:36, Ævar Arnfjörð Bjarmason wrote:
> On Tue, Dec 21 2021, Philip Oakley wrote:
>
>> Sorry for the late comment..
>>
>> On 10/12/2021 14:31, Johannes Schindelin wrote:
>>> Hi Ævar,
>>>
>>> On Thu, 9 Dec 2021, Ævar Arnfjörð Bjarmason wrote:
>>>
>>>> The difference between "master" and "git-for-windows/main" is large
>>>> enough that comparing the two will segfault on my system. This is
>>>> because the range-diff code does some expensive calculations and will
>>>> overflow the "int" type.
>>> You are holding this thing wrong.
>>>
>>> The `main` branch of Git for Windows uses merging rebases, therefore you
>>> need to use a commit range like
>>> `git-for-windows/main^{/^Start.the.merging}..git-for-windows/main` and
>>> compare it to `git-for-windows/main..master`.
>> I'm not sure that a Git repo has an established way of indicating to how
>> it's branching/merging/releasing workflow is set up, especially for
>> projects with non-normative use cases, such as Git for Windows. We don't
>> have a git document for covering  the different workflows in common use
>> for easy reference and consistent terminology.
>>
>> The merging rebase flow, with 'fake' merge does solve a problem that
>> git.git doesn't have but could easily be a common process for 'friendly
>> forks' that follow an upstream with local patches. The choice of
>> '{/^Start.the.merging}' is currently specific to the Git-for-Windows
>> case making it harder to discover this useful maintainer method.
> Yes, but let's not get lost in the weeds here. As I noted I just picked
> GFW as a handy example of a large history & that command as a handy
> example of something that segfaults on "master".

Had you already experienced the segfault locally, without using the GFW
example? How many commits were present in that case?

The GFW example seems like it's taken the discussion in the wrong direction.

For me:
$ git log git/master..origin/main --pretty=oneline | wc -l
62105

That's a lot of commits to have in a range diff. It's almost as big as
the whole of git/master

$ git log git/master --pretty=oneline | wc -l
65400

Personally I'd like a way of trimming 'deadheads' that's a bit easier
that needing to remember Dscho's magic string [1], but time will tell.
> So the point really isn't to say that we should fix range-diff becase
> it'll allow us to run this practically useful command on a git.git fork.
>
>> I fully agree that the range-diff should probably have a patch limit at
>> some sensible value.
> Why would it? If I'm willing to spend the CPU to produce a range-diff of
> an absurdly large range and I've got the memory why shouldn't we support
> it?

There will always be a limit somewhere, and if it's not commit count or
other easily explained & checked limit it will be hard to rationalise
about why Git suddenly fails with an error (or segfault) in those
humungous case.
>
> We don't in cases like xdiff where it's not trivial to just raise the
> limits, but here it seems relatively easy.
>
> I think limits to save users from spending CPU time they didn't expect
> are reasonable, but then we can handle them like the diff/merge rename
> detection limits, i.e. print a warning/advice, and allow the user to
> opt-out.
>
> That also doesn't really apply here since "diff/merge" will/might still
> do something useful in those scenarios, whereas range-diff would just
> have truncated output.
>
>> The 'confusion' between the types size_t, long and int, does ripple
>> through a lot of portable code, as shown in the series. Not an easy problem.
> Yes, although here we're not just casting and overflowing types, but
> overflowing on multiplication and addition, whereas usually we'd just
> overflow on "nr" being too big for "int" or similar.
I've been very slowly looking at the `long` limits on GFW which have
very similar arithmetic issues for pointers, often with no clear answers.

--

Wishing you all the best for Christmas.
Philip

[1]
https://lore.kernel.org/git/01fe28d8-2887-bc42-c91b-c3237b5186a7@iee.email/



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux