Re: [PATCH 2/2] object-file: retry linking file into place when occluding file vanishes

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 03, 2025 at 09:19:55AM +0100, Patrick Steinhardt wrote:

> In a preceding commit, we have adapted `check_collision()` to ignore

If it's in next and we are building on top, I think we can mention it by
name: 0ad3d65652 (object-file: fix race in object collision check,
2024-12-30).

> the case where either of the colliding files vanishes. This should be
> safe in general when we assume that the contents of these two files were
> the same. But the check is all about detecting collisions, so that
> assumption may be too optimistic.

I found this a little vague about what "too optimistic" means. ;)
Maybe something like:

  Prior to 0ad3d65652, callers could expect that a successful return
  from finalize_object_file() means that either the file was moved into
  place, or the identical bytes were already present. If neither of
  those happens, we'd return an error.

  Since that commit, if the destination file disappears between our
  link() call and the collision check, we'd return success without
  actually checking the contents, and without retrying the link. This
  solves the common case that the files were indeed the same, but it
  means that we may corrupt the repository if they weren't (this implies
  a hash collision, but the whole point of this function is protecting
  against hash collisions).

  We can't be pessimistic and assume they're different; that hurts the
  common case that 0ad3d65652 was trying to fix. But after seeing that
  the destination file went away, we can retry linking again...

> Furthermore, stop treating `ENOENT` specially for the source file. It
> shouldn't happen that the source vanishes as we're using a fresh
> temporary file for it, so if it does vanish it indicates an actual
> error.

OK. I think this is worth doing, but I'd probably have put it into its
own commit.

> @@ -1987,6 +1988,8 @@ static int check_collision(const char *source, const char *dest)
>  	if (fd_dest < 0) {
>  		if (errno != ENOENT)
>  			ret = error_errno(_("unable to open %s"), dest);
> +		else
> +			ret = CHECK_COLLISION_DEST_VANISHED;
>  		goto out;
>  	}

OK. We're depending on error() never using "-2" itself, but I think that
is a reasonable thing.

> @@ -2034,8 +2037,10 @@ int finalize_object_file(const char *tmpfile, const char *filename)
>  int finalize_object_file_flags(const char *tmpfile, const char *filename,
>  			       enum finalize_object_file_flags flags)
>  {
> -	struct stat st;
> -	int ret = 0;
> +	int ret;
> +
> +retry:
> +	ret = 0;
>  
>  	if (object_creation_mode == OBJECT_CREATION_USES_RENAMES)
>  		goto try_rename;
> @@ -2056,6 +2061,8 @@ int finalize_object_file_flags(const char *tmpfile, const char *filename,
>  	 * left to unlink.
>  	 */
>  	if (ret && ret != EEXIST) {
> +		struct stat st;
> +

OK, we move the stat struct here where it's needed. I think that's
somewhat orthogonal to your patch, but reduced scoping does help make
the goto's less confusing.

I suspect there's a way to write this as a loop that would be more
structured, but it would be a bigger refactor. Bonus points if it also
get rid of the try_rename goto, too. ;)

I'm OK punting on that, though.

> @@ -2071,9 +2078,13 @@ int finalize_object_file_flags(const char *tmpfile, const char *filename,
>  			errno = saved_errno;
>  			return error_errno(_("unable to write file %s"), filename);
>  		}
> -		if (!(flags & FOF_SKIP_COLLISION_CHECK) &&
> -		    check_collision(tmpfile, filename))
> +		if (!(flags & FOF_SKIP_COLLISION_CHECK)) {
> +			ret = check_collision(tmpfile, filename);
> +			if (ret == CHECK_COLLISION_DEST_VANISHED)
> +				goto retry;
> +			else if (ret)
>  				return -1;
> +		}
>  		unlink_or_warn(tmpfile);
>  	}

I share Junio's uneasiness with looping forever based on external input
from the filesystem (even though you _should_ eventually win the race,
that's not guaranteed, and of course a weird filesystem might confuse
us). Could we put a stop-gap in it like:

diff --git a/object-file.c b/object-file.c
index 88432cc9c0..262a2f3df2 100644
--- a/object-file.c
+++ b/object-file.c
@@ -2038,6 +2038,7 @@ int finalize_object_file_flags(const char *tmpfile, const char *filename,
 			       enum finalize_object_file_flags flags)
 {
 	int ret;
+	int retries = 0;
 
 retry:
 	ret = 0;
@@ -2080,8 +2081,11 @@ int finalize_object_file_flags(const char *tmpfile, const char *filename,
 		}
 		if (!(flags & FOF_SKIP_COLLISION_CHECK)) {
 			ret = check_collision(tmpfile, filename);
-			if (ret == CHECK_COLLISION_DEST_VANISHED)
+			if (ret == CHECK_COLLISION_DEST_VANISHED) {
+				if (retries++ > 5)
+					return error(_("unable to write repeatedly vanishing file %s"), filename);
 				goto retry;
+			}
 			else if (ret)
 				return -1;
 		}

Otherwise, I think the logic looks good.

-Peff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux