RE: Non-robust lock files in containers can lead to repo corruption

"Randall S. Becker" <rsbecker@xxxxxxxxxxxxx> · Mon, 12 Aug 2019 09:48:04 -0400

On August 10, 2019 12:06 PM, Gregory Szorc wrote:
> I tracked down a source of Git corrupting repositories to lock file design not
> being robust when containers / PID namespaces are present.
> 
> In my case, the corruption stemmed from premature release of the `git gc`
> lock in the gc.pid file. But since the lock file code for that file is in gc.c, there
> could be other lock files in Git affected by the same design limitation as well.
> 
> The lock design of gc.pid stores the current hostname and PID of the locking
> process in the file. If another process comes along and its hostname matches
> the stored hostname, it checks to see if the listed PID exists. If the PID is
> missing, it assumes the lock is stale and releases the lock.
> 
> A limitation with this approach is it isn't robust in the presence of containers
> / PID namespaces. In containers, it is common for the hostname to match
> the container host's hostname. Or the hostname will be static string. In
> Kubernetes, all containers within a pod share the same hostname. Containers
> (almost always) run in separate PID namespaces, so PIDs from outside the
> container aren't visible to the container itself.
> This means that if e.g. 2 `git gc` processes are running with the same
> hostname in separate containers / PID namespaces, Git could prematurely
> release the lock file because it thinks the "other" PID is dead and repo
> corruption could ensue due to the 2 `git gc` processes racing with each other.
> 
> The on-disk format of lock files obviously needs to be backwards compatible
> with older clients. One backwards compatible solution is to append
> something to the hostname to disambiguate containers / PID namespaces.
> Mercurial appends the current PID namespace identifier to the hostname [1]
> and my experience is that this is sufficient to mitigate the issue. It is possible
> more robust solutions are achievable.

While I like the idea personally, many platforms, including NonStop (TNS/E),  do not support pid namespaces. In particular setns(2) may not be implemented. Please make sure that any changes detect this condition properly and omit the use of namespaces.

Regards,
Randall