Re: crash on git diff-tree -Ganything <tree> for new files with textconv filter

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 2012-10-28 13:01, Jeff King wrote:
> On Sat, Oct 27, 2012 at 08:37:24PM +0200, Peter Oberndorfer wrote:
>
>> It seems "git diff-tree -Ganything <tree>" crashes[1] with a null
>> pointer dereference
>> when run on a commit that adds a file (pdf) with a textconv filter.
>>
>> It can be reproduced with vanilla git by having a commit on top that
>> adds a file with a textconv filter and executing git diff-tree
>> -Ganything HEAD
>> But running git log -Ganything still works without a crash.
>> This problem seems to exist since the feature was first added in f506b8e8b5.
> Thanks for a thorough bug report. I didn't reproduce the crash, but I
> can see how it happens (it happens with diff-tree because we will reuse
> the working tree file in that instance; it could also happen if you
> turned on textconv caching).
>
>> While testing I also noticed the -S and -G act on the original file
>> instead of the textconv munged data.
>> Is this intentional or caused by accessing the wrong data?
> Both, perhaps. :)
Hi,
thanks for your patch for this!

>
> -G operates on the munged data; you can see it feed the munged data to
> xdiff in diff_grep. But the optimization for handling added and removed
> files accidentally fed the wrong pointer. Fixing that is a no-brainer,
> since the optimization is inconsistent with the regular code path.
>
> -S, however, predates the invention of textconv and has never used it.
> It is a little less clear that textconv is the right thing here, because
> it is not about grepping the diff, but about counting occurrences of the
> string inside the file content. I tend to think that doing so on the
> textconv'd data would be what people generally want, but it is a
> behavior change.
>
>> Wild guess: should we really access p->one->data and not mf1.ptr?
> Precisely. Thanks for your wild guess; it made finding the bug very
> easy. :)
>
>> Is there some more information i should provide?
> The patch below should fix it. I added tests, but please try your
> real-world test case on it to double-check.

I tested your patch, but now it crashes for another reason :-)
i have a file with exactly 12288(0x3000) bytes in the repository.
When the file is loaded, the data is placed luckily so the data end
falls at a page boundary.
Later diff_grep() calls regexec() which calls strlen() on the loaded buffer
and ends up reading beyond the actual data into the next page
which is not allocated and causes a pagefault.
Or it could possibly (randomly) match the regex on data that is not
actually part of a file...
Different memory allocation rules on Windows probably also have some
influence here.

My guess is that diff_filespec->data is not supposed to be zero terminated
and we should not invoke strlen() on a not zero terminated data.
But this should be decided by somebody who knows the rules.

Greetings Peter
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]