Re: [BUG] commit fails with 'bus error' when working directory is on an NFS share

On Mon, Dec 02, 2024 at 07:48:05PM -0700, Dmitriy Panteleyev wrote:

> > I wonder if building git with:
> >
> >   make SANITIZE=address,undefined
> >
> > and running the same test might yield anything useful.
> 
> Not sure if this is useful, but this is what I got:

Thanks. If you bisect with that command, does it end up on the same
commit?

> AddressSanitizer:DEADLYSIGNAL
> =================================================================
> ==155141==ERROR: AddressSanitizer: BUS on unknown address (pc
> 0x78811e863aed bp 0x7ffe9d5ac800 sp 0x7ffe9d5ac770 T0)
> ==155141==The signal is caused by a READ memory access.
> ==155141==Hint: this fault was caused by a dereference of a high value
> address (see register values below).  Disassemble the provided pc to
> learn which register was used.
>     #0 0x78811e863aed in inflate (/lib/x86_64-linux-gnu/libz.so.1+0xfaed) (BuildId: bbefe2bbdc367b0c3cfbfcf80c579930496fb963)
>     #1 0x563e32ec7e5f in git_inflate /tmp/git_tests/git/zlib.c:118
>     #2 0x563e32bde431 in unpack_loose_header /tmp/git_tests/git/object-file.c:1271
>     #3 0x563e32be429c in loose_object_info /tmp/git_tests/git/object-file.c:1474

Hmm. So we are inflating a loose object. It's mmap()-ed, so presumably
that is why you get a bus error rather than a read error: the underlying
nfs system, for whatever reason, is not able to provide the bytes when
the mapped page is touched.

I'm still super puzzled about why this would start happening, or how it
could be related to that commit. The rest of the stack here:

>     #4 0x563e32be5348 in do_oid_object_info_extended /tmp/git_tests/git/object-file.c:1582
>     #5 0x563e32be5dac in oid_object_info_extended /tmp/git_tests/git/object-file.c:1640
>     #6 0x563e32be5dac in oid_object_info /tmp/git_tests/git/object-file.c:1656
>     #7 0x563e32bf8b57 in parse_object_with_flags /tmp/git_tests/git/object.c:290

shows that we are coming from parse_object_with_flags(). Is it possible
that calling stat() somehow primes the nfs system to be better able to
serve the mmap'd data? That seems kind of weird.

Maybe one other thing to try. Build with:

  make NO_MMAP=1

(optionally with SANITIZE also). That should replace the mmap calls with
a compat wrapper that just reads into an internal buffer. I suspect that
will make your problem go away, though I'm not sure it gets us any
closer to understanding what's going wrong.
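
For reference, the idea behind such a wrapper can be sketched roughly as
below. This is a simplified illustration and not git's actual compat
code; the function name and the exact error handling are assumptions
made for the example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a read-only mmap(): allocate a buffer and
 * fill it with pread(). Returns NULL on failure; the caller frees the
 * buffer with free() rather than munmap().
 */
void *read_instead_of_mmap(int fd, size_t length, off_t offset)
{
	char *buf = malloc(length ? length : 1);
	size_t done = 0;

	if (!buf)
		return NULL;

	while (done < length) {
		ssize_t n = pread(fd, buf + done, length - done,
				  offset + (off_t)done);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			/* An I/O problem is reported here, not via SIGBUS. */
			free(buf);
			return NULL;
		}
		if (n == 0) {
			/* File is shorter than requested: zero-fill the rest. */
			memset(buf + done, 0, length - done);
			break;
		}
		done += (size_t)n;
	}
	return buf;
}
```

With this approach any trouble fetching the bytes turns into an error
return from pread() at a known point, which is why I'd expect the bus
error to disappear (and possibly be replaced by a more informative
failure).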

What's the nfs server in your setup? Is it another Linux machine, or is
it some other implementation? Do you know which nfs version?

-Peff
