Re: [PATCH 2/3] diff_populate_filespec: NUL-terminate buffers

On Mon, Sep 05, 2016 at 05:45:06PM +0200, Johannes Schindelin wrote:

> It is true that many code paths populate the mmfile_t structure silently
> appending a NUL, e.g. when running textconv on a temporary file and
> reading the results back into an strbuf.
> 
> The assumption is most definitely wrong, however, when mmap()ing a file.
> 
> Practically, we seemed to be lucky that the bytes after mmap()ed memory
> were 1) accessible and 2) somehow contained NUL bytes *somewhere*.
> 
> In a use case reported by Chris Sidi, it turned out that the mmap()ed
> file had the precise size of a memory page, and on Windows the bytes
> after memory-mapped pages are in general not valid.
> 
> This patch works around that issue, giving us time to discuss the best
> course how to fix this problem more generally.
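
To illustrate the page-size case described in the quoted text: a mapping is made in whole pages, so a file that does not end on a page boundary leaves some bytes after its contents inside the last mapped page, and reading one byte past the end stays within the mapping. When the file size is an exact multiple of the page size there are no such bytes, and data[size] falls on the next page. A minimal sketch of that arithmetic follows; the helper name slack_bytes is hypothetical and is not part of git.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: number of bytes between
 * the end of a file's contents and the end of the last page of its
 * mapping. A result of 0 means the byte at data[file_size] lies
 * outside the mapped pages.
 */
size_t slack_bytes(size_t file_size, size_t page_size)
{
	size_t used_in_last_page = file_size % page_size;

	/* An exact multiple of the page size leaves no spare bytes. */
	if (used_in_last_page == 0)
		return 0;
	return page_size - used_in_last_page;
}
```

With a 4096-byte page, a 4095-byte file leaves one spare byte, while a 4096-byte file leaves none, which matches the failing case in the report.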

I don't know if we are in that much of a rush. This bug has been around
for many years (the thread I linked earlier is from 2012). Yes, it's bad
and annoying, but we can probably spend a few days discussing the
solution.

> diff --git a/diff.c b/diff.c
> index 534c12e..32f7f46 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2826,6 +2826,15 @@ int diff_populate_filespec(struct diff_filespec *s, unsigned int flags)
>  			s->data = strbuf_detach(&buf, &size);
>  			s->size = size;
>  			s->should_free = 1;
> +		} else {
> +			/* data must be NUL-terminated so e.g. for regexec() */
> +			char *data = xmalloc(s->size + 1);
> +			memcpy(data, s->data, s->size);
> +			data[s->size] = '\0';
> +			munmap(s->data, s->size);
> +			s->should_munmap = 0;
> +			s->data = data;
> +			s->should_free = 1;
>  		}

Without having done a complete audit recently, my gut and my
recollection from previous discussions is that regexec() really is the
culprit here for the diff code[1]. If we are going to do a workaround
like this, I think we could limit it only to cases where we know it
matters, like --pickaxe-regex.
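
As a rough sketch of what limiting the workaround to the regex path could look like: copy the buffer into a NUL-terminated heap allocation only at the point where regexec() is about to be called, rather than for every populated filespec. The helper name regex_matches_buf and its return convention are assumptions made for illustration; this is not code from diff.c or from the patch.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of git: match an extended regex against
 * a length-delimited buffer that may lack a trailing NUL (for example,
 * mmap()ed file contents).
 * Returns 1 on match, 0 on no match, -1 on error.
 */
int regex_matches_buf(const char *pattern, const char *buf, size_t len)
{
	regex_t re;
	char *copy;
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;

	/* regexec() reads up to a NUL, so hand it a terminated copy. */
	copy = malloc(len + 1);
	if (!copy) {
		regfree(&re);
		return -1;
	}
	if (len)
		memcpy(copy, buf, len);
	copy[len] = '\0';

	ret = (regexec(&re, copy, 0, NULL, 0) == 0);

	free(copy);
	regfree(&re);
	return ret;
}
```

For example, regex_matches_buf("o+b", "foobarXXX", 6) looks only at "foobar". Note that regexec() still stops at the first NUL, so a buffer with embedded NUL bytes is searched only up to that point; the copy addresses the missing terminator, not binary content.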

Can it be triggered with -G? I thought that operated on the diff content
itself, which would always be in a heap buffer (which should be NUL
terminated, but if it isn't, that would be a separate fix from this).

-Peff

[1] We do make the assumption elsewhere that git objects are
    NUL-terminated, but that is enforced by the object-reading code
    (with the exception of streamed blobs, but those are obviously dealt
    with separately anyway).


