On 07.05.22 at 14:33, Philip Oakley wrote:
>
> On 7 May 2022 03:15:00 BST, Jason Hatton <jhatton@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Philip Oakley <philipoakley@iee.email> writes:
>
> This may treat non-zero multiple of 4GiB as "not racy", but has
> anybody double checked the concern René brought up earlier that a
> 4GiB file that was added and then got rewritten to 2GiB within the
> same second would suddenly start getting treated as not racy?
>
> This is the pre-existing problem, that ~1 in 2^31 size changes might not
> get noticed for size change. The 0 byte / 4GiB change is an identical
> issue, as is changing from 3 bytes to 4GiB+3 bytes, etc., so that's no
> worse than before (well maybe twice as 'unlikely').
>
> OK, it added one more case to 2^32-1 existing cases, I guess.
>
> The patch (the final version of it anyway) needs to be accompanied
> by a handful of test additions to tickle corner cases like that.
>
> They'd be protected by the EXPENSIVE prerequisite I would assume.
>
> Oh, absolutely. Thanks for spelling that out.
>
> I have been testing out the patch a bit and have good and (mostly) bad news.
>
> What works using a munge value of 1:
>
> $ git add
> $ git status
>
> Racy detection seems to work.
>
> $ touch .git/index 4GiB # 4GiB is now racy
> $ git status # Git will rehash the racy file
> $ git status # Git cached the file. Second status is fast.
>
> What doesn't work:
>
> $ git checkout 4GiB
> $ fatal: packed object is corrupt!
>
> Using a munge value of 1<<31 causes even more problems. The file hashes in the
> index for 4GiB files (git ls-files -s --debug) are set to the zero file hash.
>
> I looked up and down the code base and couldn't figure out how the munged
> value was leaking out of read-cache.c and breaking things. Most of the code
> I found tends to use stat and then convert that to a size_t, not using the
> munged unsigned int at all.
>
> Maybe someone else will have better luck. This seems over my head :(
>
> Thanks
> --
> Jason
>
> Is this on Git for Windows or a 64-bit Linux?
> There are still some issues on GfW for 2GiB+ files (long vs long long int).

Which would explain the zero file hash, and would make the platform unfit for
handling big files at all at this time.
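To make the "munge" idea concrete, here is a minimal sketch of what I
understand the patch to be doing (the helper name is made up for
illustration and does not come from the patch itself): the on-disk index
stores only 32 bits for the cached size, so a non-empty file whose size
is an exact multiple of 4GiB truncates to 0 and becomes indistinguishable
from an empty file, and substituting a non-zero sentinel keeps the size
comparison meaningful for such files.

	#include <stdint.h>
	#include <sys/types.h>

	/*
	 * Hypothetical helper: truncate the stat size to the 32 bits the
	 * index can store, but never let a non-empty file end up with a
	 * cached size of 0.  The sentinel only has to be non-zero; 1 and
	 * 1<<31 are the two values tried in this thread.
	 */
	static uint32_t munge_cached_size(off_t st_size)
	{
		uint32_t sd_size = (uint32_t)st_size;

		if (!sd_size && st_size)
			return 1;	/* or 1U << 31 */
		return sd_size;
	}

Collisions with other sizes that share the same low 32 bits remain
possible, of course; that is the ~1 in 2^31 case discussed above.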
FWIW, on MacOS I get this with the patch applied:

$ git init --quiet /tmp/a
$ cd /tmp/a
$ : >size-0
$ dd if=/dev/zero bs=1 oseek=4294967295 count=1 of=size-4294967296
1+0 records in
1+0 records out
1 bytes transferred in 0.000365 secs (2740 bytes/sec)
$ dd if=/dev/zero bs=1 oseek=4294967296 count=1 of=size-4294967297
1+0 records in
1+0 records out
1 bytes transferred in 0.000293 secs (3413 bytes/sec)
$ dd if=/dev/zero bs=1 oseek=6442450943 count=1 of=size-6442450944
1+0 records in
1+0 records out
1 bytes transferred in 0.000266 secs (3759 bytes/sec)
$ git add size-*
$ git commit -m initial
[master (root-commit) d9c2a0a] initial
 4 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 size-0
 create mode 100644 size-4294967296
 create mode 100644 size-4294967297
 create mode 100644 size-6442450944
$ time git checkout size-*
Updated 0 paths from the index
git checkout size-*  0.01s user 0.01s system 65% cpu 0.020 total
$ git ls-files -s --debug | grep size
100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 0	size-0
  size: 0	flags: 0
100644 451971a31ea5a207a10b391df2d5949910133565 0	size-4294967296
  size: 2147483648	flags: 0
100644 3eb7feb1413c757f0d8181deb28d1dab03d64846 0	size-4294967297
  size: 1	flags: 0
100644 741285bddfa7863072c238f34e27144c2501832d 0	size-6442450944
  size: 2147483648	flags: 0

So checkout skips all of the files and their cached sizes have the
expected values.

René
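If I am reading that output right, the cached sizes are exactly what
32-bit truncation plus a 1<<31 substitution for the whole multiple of
4GiB would produce; a small standalone check of that arithmetic (again
only a sketch under that assumption, not code from the patch):

	#include <inttypes.h>
	#include <stdint.h>
	#include <stdio.h>

	/* Same idea as the sketch above, with 1<<31 as the sentinel. */
	static uint32_t cached_size(uint64_t st_size)
	{
		uint32_t sd_size = (uint32_t)st_size;

		if (!sd_size && st_size)
			return UINT32_C(1) << 31;
		return sd_size;
	}

	int main(void)
	{
		uint64_t sizes[] = { 0, 4294967296ULL, 4294967297ULL, 6442450944ULL };
		size_t i;

		for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
			printf("%" PRIu64 " -> %" PRIu32 "\n",
			       sizes[i], cached_size(sizes[i]));
		return 0;
	}

This prints 0 -> 0, 4294967296 -> 2147483648, 4294967297 -> 1 and
6442450944 -> 2147483648, matching the size: fields in the ls-files
output above.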