Re: [PATCH v3 3/9] finalize_object_file(): implement collision check

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Taylor Blau <me@xxxxxxxxxxxx> writes:

> Note that this may cause some extra computation when the files are in
> fact identical, but this should happen rarely. For example, when writing
> a loose object, we compute the object id first, then check manually for
> an existing object (so that we can freshen its timestamp) before
> actually trying to write and link the new data.

True.

> +static int check_collision(const char *filename_a, const char *filename_b)
> +{
> +	char buf_a[4096], buf_b[4096];
> +	int fd_a = -1, fd_b = -1;
> +	int ret = 0;
> +
> +	fd_a = open(filename_a, O_RDONLY);
> +	if (fd_a < 0) {
> +		ret = error_errno(_("unable to open %s"), filename_a);
> +		goto out;
> +	}
> +
> +	fd_b = open(filename_b, O_RDONLY);
> +	if (fd_b < 0) {
> +		ret = error_errno(_("unable to open %s"), filename_b);
> +		goto out;
> +	}

Two and two half comments on this function.

 * We compare 4k at a time here, while copy.c copies 8k at a time,
   and bulk-checkin.c uses 16k at a time.  Outside the scope of this
   topic, we probably should pick one number and stick to it, unless
   we have measured to pick perfect number for each case (and I know
   I picked 8k for copy.c and 16k for bulk-checkin.c both out of
   thin air).

 * I would have expected at least we would fstat() them to declare
   difference immediately after we find their sizes differ, for
   example.  As we assume that calling into this function should be
   rare, we prefer not to pay in complexity for performance here?

 * We use read_in_full() and assume that a short-read return from
   the function happens only at the end of file due to EOF, which is
   another reason why we can do away without fstat() on these files.

 * An error causes the caller to assume collision (because we assign
   the return value of error() to ret), which should do the same
   action as an actual collision to abort and keep the problematic
   file for forensics.

>  /*
>   * Move the just written object into its final resting place.
>   */
> @@ -1941,8 +1992,8 @@ int finalize_object_file(const char *tmpfile, const char *filename)
>  			errno = saved_errno;
>  			return error_errno(_("unable to write file %s"), filename);
>  		}
> -		/* FIXME!!! Collision check here ? */
> -		unlink_or_warn(tmpfile);
> +		if (check_collision(tmpfile, filename))
> +			return -1;
>  	}

OK.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux