Re: [PATCH v4 8/8] clean/smudge: allow clean filters to process extremely large files

On Tue, Nov 02, 2021 at 03:46:11PM +0000, Matt Cooper via GitGitGadget wrote:
> From: Matt Cooper <vtbassmatt@xxxxxxxxx>
>
> The filter system allows for alterations to file contents when they're

A small nit-pick: the Git book, at
https://git-scm.com/book/en/v2/Customizing-Git-Git-Attributes
describes filters as
"...substitutions in files on commit/checkout."

Should we use this wording here as well?


> moved between the database and the worktree. We already made sure that
> it is possible for smudge filters to produce contents that are larger
> than `unsigned long` can represent (which matters on systems where
> `unsigned long` is narrower than `size_t`, most notably 64-bit Windows).
> Now we make sure that clean filters can _consume_ contents that are
> larger than that.
>
> Note that this commit only allows clean filters' _input_ to be larger
> than can be represented by `unsigned long`.
>
> This change makes only a very minute dent into the much larger project
> to teach Git to use `size_t` instead of `unsigned long` wherever
> appropriate.
>
> Helped-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> Signed-off-by: Matt Cooper <vtbassmatt@xxxxxxxxx>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> ---
>  convert.c                   |  2 +-
>  t/t1051-large-conversion.sh | 11 +++++++++++
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/convert.c b/convert.c
> index fd9c84b0257..5ad6dfc08a0 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -613,7 +613,7 @@ static int crlf_to_worktree(const char *src, size_t len, struct strbuf *buf,
>
>  struct filter_params {
>  	const char *src;
> -	unsigned long size;
> +	size_t size;
>  	int fd;
>  	const char *cmd;
>  	const char *path;
> diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh
> index e6d52f98b15..042b0e44292 100755
> --- a/t/t1051-large-conversion.sh
> +++ b/t/t1051-large-conversion.sh
> @@ -98,4 +98,15 @@ test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
>  	test "$size" -eq $((5 * 1024 * 1024 * 1024 + $small_size))
>  '
>
> +# This clean filter writes down the size of input it receives. By checking against
> +# the actual size, we ensure that cleaning doesn't mangle large files on 64-bit Windows.
> +test_expect_success EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> +		'files over 4GB convert on input' '
> +	test-tool genzeros $((5*1024*1024*1024)) >big &&
> +	test_config filter.checklarge.clean "wc -c >big.size" &&
> +	echo "big filter=checklarge" >.gitattributes &&
> +	git add big &&
> +	test $(test_file_size big) -eq $(cat big.size)
> +'
> +
>  test_done
> --
> gitgitgadget



