"Matt Cooper via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Matt Cooper <vtbassmatt@xxxxxxxxx> > > The filter system allows for alterations to file contents when they're > added to the database or workdir. ("Smudge" when moving to the workdir; > "clean" when moving to the database.) This is used natively to handle CRLF > to LF conversions. It's also employed by Git-LFS to replace large files > from the workdir with small tracking files in the repo and vice versa. Not a huge deal, but make it a habit to spell "working tree" not "workdir", as someday you'd write end-user facing documentation in our tree ;-). > Git pulls the entire smudged file into memory. Giving "for what" would be helpful to readers. Git reads the entire smudged file into memory to convert it into a "clean" form to be used in-core. > While this is inefficient, > there's a more insidious problem on some platforms due to inconsistency > between using unsigned long and size_t for the same type of data (size of > a file in bytes). On most 64-bit platforms, unsigned long is 64 bits, and > size_t is typedef'd to unsigned long. On Windows, however, unsigned long is > only 32 bits (and therefore on 64-bit Windows, size_t is typedef'd to > unsigned long long in order to be 64 bits). > > Practically speaking, this means 64-bit Windows users of Git-LFS can't > handle files larger than 2^32 bytes. Other 64-bit platforms don't suffer > this limitation. > > This commit introduces a test exposing the issue; future commits make it > pass. The test simulates the way Git-LFS works by having a tiny file > checked into the repository and expanding it to a huge file on checkout. 
>
> Helped-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> Signed-off-by: Matt Cooper <vtbassmatt@xxxxxxxxx>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> ---
>  t/t1051-large-conversion.sh | 14 ++++++++++++++
>  1 file changed, 14 insertions(+)
>
> diff --git a/t/t1051-large-conversion.sh b/t/t1051-large-conversion.sh
> index 8b7640b3ba8..bff86c13208 100755
> --- a/t/t1051-large-conversion.sh
> +++ b/t/t1051-large-conversion.sh
> @@ -83,4 +83,18 @@ test_expect_success 'ident converts on output' '
>  	test_cmp small.clean large.clean
>  '
>
> +# This smudge filter prepends 5GB of zeros to the file it checks out. This
> +# ensures that smudging doesn't mangle large files on 64-bit Windows.
> +test_expect_failure EXPENSIVE,SIZE_T_IS_64BIT,!LONG_IS_64BIT \
> +		'files over 4GB convert on output' '
> +	test_commit test small "a small file" &&
> +	test_config filter.makelarge.smudge \
> +		"test-tool genzeros $((5*1024*1024*1024)) && cat" &&
> +	echo "small filter=makelarge" >.gitattributes &&
> +	rm small &&
> +	git checkout -- small &&
> +	size=$(test_file_size small) &&
> +	test "$size" -ge $((5 * 1024 * 1024 * 1024))
> +'

Why not exactly 5G, but anything that is at least 5G is OK?

Thanks.

>  test_done
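
For what it is worth, an exact expectation looks computable. A rough
sketch, not taken from the patch, assuming the filter output is the 5GiB
of zeros followed by the unmodified blob, and that "small" holds the
text plus a trailing newline (both are assumptions, as are the variable
names):

```shell
# Sketch only: the size the checked-out file would have if the smudge
# filter emits 5GiB of zeros followed by the original blob contents.
# Assumes the blob is "a small file" plus a trailing newline.
zeros=$((5 * 1024 * 1024 * 1024))
small_size=$(printf 'a small file\n' | wc -c)
expected=$((zeros + small_size))
echo "expected size: $expected bytes"
```

In the test itself this would presumably mean recording the size of
"small" before removing it and comparing with -eq instead of -ge.
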