Re: [PATCH v3] Prevent megablobs from gunking up git packs

Dana How <danahow@xxxxxxxxx> writes:

> diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
> index 19b0aa1..59be849 100644
> --- a/builtin-pack-objects.c
> +++ b/builtin-pack-objects.c
> ...
> @@ -371,8 +372,6 @@ static unsigned long write_object(struct sha1file *f,
>  				pack_size_limit - write_offset : 0;
>  				/* no if no delta */
>  	int usable_delta =	!entry->delta ? 0 :
> -				/* yes if unlimited packfile */
> -				!pack_size_limit ? 1 :
>  				/* no if base written to previous pack */
>  				entry->delta->offset == (off_t)-1 ? 0 :
>  				/* otherwise double-check written to this
> @@ -408,7 +407,7 @@ static unsigned long write_object(struct sha1file *f,
>  		buf = read_sha1_file(entry->sha1, &type, &size);
>  		if (!buf)
>  			die("unable to read %s", sha1_to_hex(entry->sha1));
> -		if (size != entry->size)
> +		if (size != entry->size && type == obj_type)
>  			die("object %s size inconsistency (%lu vs %lu)",
>  			    sha1_to_hex(entry->sha1), size, entry->size);
>  		if (usable_delta) {

I do not quite get how these two hunks relate to the topic of
this patch.  Care to enlighten?

> @@ -564,6 +563,17 @@ static off_t write_one(struct sha1file *f,
>  			return 0;
>  	}
>  
> +	/* refuse to include as many megablobs as possible */
> +	if (max_blob_size && e->size >= max_blob_size) {
> +		struct stat st;
> +		/* skip if unpacked, remotely packed, or loose anywhere */
> +		if (!e->in_pack || !e->in_pack->pack_local || find_sha1_file(e->sha1, &st)) {
> +			e->offset = (off_t)-1;	/* might drop reused delta base if mbs less */
> +			written++;
> +			return offset;
> +		}
> +	}
> +
>  	e->offset = offset;
>  	size = write_object(f, e, offset);
>  	if (!size) {

I thought you were simply ignoring the "naughty blobs"; why
should that be done this late in the call sequence?  I haven't
followed the existing code nor your patch closely, but I wonder
why the filtering is not simply done inside (or by the caller of)
add_object_entry().  You would need to call sha1_object_info()
much earlier than the current code does, though.
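
To illustrate the shape of what I am suggesting, here is a rough,
standalone sketch.  It is not git code: struct toy_entry,
struct toy_pack_list and toy_add_entry() are all hypothetical
names, and it assumes the object's size and type are already known
when the entry is added (which is what calling sha1_object_info()
earlier would buy you).  It also leaves out the "is it loose or
in a non-local pack" test from your hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object entry; NOT git's struct object_entry. */
struct toy_entry {
	unsigned long size;	/* assumed to be known at add time */
	int is_blob;		/* assumed to be known at add time */
};

#define TOY_MAX_ENTRIES 16

/* Hypothetical list of objects that will go into the pack. */
struct toy_pack_list {
	struct toy_entry entries[TOY_MAX_ENTRIES];
	size_t nr;
	unsigned long max_blob_size;	/* 0 means "no limit" */
};

/*
 * Filter at add time: an oversized blob never enters the list, so the
 * later writing stage does not need to know about the limit at all.
 * Returns 1 if the entry was added, 0 if it was filtered out or the
 * (fixed-size, toy) list is full.
 */
static int toy_add_entry(struct toy_pack_list *list,
			 unsigned long size, int is_blob)
{
	assert(list != NULL);

	/* same ">=" comparison as the hunk above, restricted to blobs */
	if (list->max_blob_size && is_blob && size >= list->max_blob_size)
		return 0;

	if (list->nr >= TOY_MAX_ENTRIES)
		return 0;

	list->entries[list->nr].size = size;
	list->entries[list->nr].is_blob = is_blob;
	list->nr++;
	return 1;
}
```

With the check placed here, there would be no need to mark a
skipped entry with an offset of (off_t)-1 at write time, because
such an entry is never on the list in the first place.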
