Re: [Bug] Git ReadOnly Temp Packfile Causes "Bad file descriptor" And -13 Access Error With NFSv4

"brian m. carlson" <sandals@xxxxxxxxxxxxxxxxxxxx> · Mon, 10 Feb 2025 22:32:24 +0000

On 2025-02-10 at 15:56:59, Maloney, Bryan wrote:
> ### Error
> Kernel logs:
> ```
> NFSv4: state recovery failed for open file pack/tmp_pack_aR0Mu3, error = -13
> ```
> Git clone output:
> ```
> fatal: write error: Bad file descriptor, 137.31 MiB | 45.77 MiB/s
> fatal: fetch-pack: invalid index-pack output
> ```
> 
> 
> ### Context
> 
> The following error is seen when running git clone over NFSv4 and a failover, or server restart, occurs:
> ```
> NFSv4: state recovery failed for open file pack/tmp_pack_aR0Mu3, error = -13
> ```
> This error is an access denied error that happens when you try to open a file with insufficient permissions. In this case the file being opened is a read only file and it is attempted to be opened with write access.
> 
> Git opens/creates this file with the O_RDWR flag but then applies read only permissions to it, 0444. Since the permissions are changed after the file is opened, the file handle works fine. However if the file was attempted to be re-opened with that same file handle we would see a -13 error. This is what we see following a failover in NFSv4. When clients reclaim their open files, the NFS server re-evaluates the file access.

Your description of the problem is spot on.  We intentionally set the
permissions to 0444 because we never want anyone to change loose object
files or packs, since doing so would corrupt the repository.  This
behaviour is specifically allowed by POSIX[0]:

  The argument following the oflag argument does not affect whether the
  file is open for reading, writing, or for both.
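
A minimal Python sketch of that behaviour on a local POSIX file system
may help. The helper name `create_readonly_and_write` and the file name
are illustrative assumptions, not Git's actual code. It creates a file
with mode 0444 through an O_RDWR open, writes through the resulting
descriptor, and then tries a second O_RDWR open of the same path, which
is normally refused with EACCES (errno 13) for a non-root user:

```python
import errno
import os
import tempfile


def create_readonly_and_write(directory):
    """Create a file with mode 0444 via O_RDWR and write through the fd.

    Returns (bytes_written, reopen_errno); reopen_errno is None when a
    second O_RDWR open succeeds (for example when running as root).
    """
    path = os.path.join(directory, "tmp_pack_example")  # hypothetical name
    # The mode argument only sets the permissions of the new file; it does
    # not restrict the access mode of the descriptor returned here.
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o444)
    try:
        written = os.write(fd, b"PACK")
        # A fresh open is checked against the file's read-only permissions.
        try:
            fd2 = os.open(path, os.O_RDWR)
        except OSError as exc:
            reopen_errno = exc.errno
        else:
            os.close(fd2)
            reopen_errno = None
    finally:
        os.close(fd)
    return written, reopen_errno


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        written, reopen_errno = create_readonly_and_write(tmp)
        print(written, reopen_errno == errno.EACCES)
```

This only demonstrates local semantics; it says nothing about how a
particular NFS server handles state reclaim.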

POSIX does not allow the re-evaluation of file system access once the
file is open, so it sounds like your file system is not POSIX compliant,
and Git generally requires lots of POSIX-compliant functionality from
the file system. For instance, we also require the POSIX consistency
guarantees[1], among myriad others:

  If a read() of file data can be proven (by any means) to occur after a
  write() of the data, it must reflect that write(), even if the calls
  are made by different threads. A similar requirement applies to
  multiple write operations to the same file position.
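
As a rough illustration of that guarantee, the following Python sketch
writes through one descriptor and reads through a second one opened on
the same file. The function and file names are hypothetical. On a
conforming file system the read must return the bytes just written:

```python
import os
import tempfile


def read_after_write(directory):
    """Write through one descriptor, then read through another."""
    path = os.path.join(directory, "consistency_example")  # hypothetical name
    wfd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    rfd = os.open(path, os.O_RDONLY)
    try:
        os.write(wfd, b"hello")
        # The read is issued after write() has returned, so POSIX requires
        # it to observe the written bytes.
        return os.read(rfd, 5)
    finally:
        os.close(rfd)
        os.close(wfd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(read_after_write(tmp))
```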

The implicit violation of that particular requirement is why cloud
syncing services often corrupt the repository.

Could you adjust your NFSv4 server so that it synchronizes state among
the primary and replicas in case of a required failover?  I know we have
people successfully using Git with NFS without problems, although this
particular issue does often hit non-POSIX-compliant NFS implementations
in a variety of ways.  (This particular variant is new to me, though.)

> This is an issue for active/passive HA file servers. Since NFSv4 evaluates file permissions at the time of opening a file, this FD will always get an access denied error if a failover occurs during git clone.

I'm not sure there's even a good way to solve this problem on the Git
side, since I suspect that if we opened the file as 0644 and then
immediately did an fchmod to 0444, you'd still fail here if the file
is reopened.  Is that correct?
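
For concreteness, the alternative sequence in question looks roughly
like the Python sketch below. The names are hypothetical, and this is
not a proposed Git patch. It shows only that, locally, the descriptor
stays writable after fchmod and the file ends up with mode 0444, which
is the same end state a server would see when re-evaluating access:

```python
import os
import stat
import tempfile


def create_then_fchmod(directory):
    """Create a file as 0644, fchmod it to 0444, then keep writing."""
    path = os.path.join(directory, "tmp_pack_fchmod_example")  # hypothetical
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        # Drop write permission on the file; the open descriptor keeps
        # the access mode it was opened with.
        os.fchmod(fd, 0o444)
        written = os.write(fd, b"PACK")
        mode = stat.S_IMODE(os.fstat(fd).st_mode)
    finally:
        os.close(fd)
    return written, mode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        written, mode = create_then_fchmod(tmp)
        print(written, oct(mode))
```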

I'll also point out that there's a variety of other software that does
the same thing as Git does, including zsh and Emacs, so fixing this in
Git doesn't really fix the entire problem that your NFS server has,
since all of that other software will also be broken in at least some
cases and require similar workarounds.  (I discovered this with a
simple, 30-second search on GitHub some time back.)  As far as I'm
aware, all other Git implementations also do the same thing as Git does,
so you'd also need to patch go-git, libgit2, and every other
implementation as well.

[0] https://pubs.opengroup.org/onlinepubs/9799919799/functions/open.html
[1] https://pubs.opengroup.org/onlinepubs/9799919799/functions/write.html
-- 
brian m. carlson (they/them or he/him)
Toronto, Ontario, CA