Re: [PATCH v2 02/10] packfile: allow content-limit for cat-file

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Eric Wong <e@xxxxxxxxx> writes:

> From: Jeff King <peff@xxxxxxxx>
>
> Avoid unnecessary round trips to the object store to speed
> up cat-file contents retrievals.  The majority of packed objects
> don't benefit from the streaming interface at all and we end up
> having to load them in core anyways to satisfy our streaming
> API.

What I found missing from the description is something like ...

    The new trick used is to teach oid_object_info_extended() that a
    non-NULL oi->contentp that means "grab the contents of the objects
    here" can be told to refrain from grabbing an object that is too
    large.

> diff --git a/object-file.c b/object-file.c
> index 065103be3e..1cc29c3c58 100644
> --- a/object-file.c
> +++ b/object-file.c
> @@ -1492,6 +1492,12 @@ static int loose_object_info(struct repository *r,
>  
>  		if (!oi->contentp)
>  			break;
> +		if (oi->content_limit && *oi->sizep > oi->content_limit) {

I cannot convince myself enough to say "content limit" is a great
name.  It invites "limited by what?  text files are allowed but
images are not?".

> diff --git a/object-store-ll.h b/object-store-ll.h
> index c5f2bb2fc2..b71a15f590 100644
> --- a/object-store-ll.h
> +++ b/object-store-ll.h
> @@ -289,6 +289,7 @@ struct object_info {
>  	struct object_id *delta_base_oid;
>  	struct strbuf *type_name;
>  	void **contentp;
> +	size_t content_limit;
>  
>  	/* Response */
>  	enum {
> diff --git a/packfile.c b/packfile.c
> index 4028763947..c12a0515b3 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1529,7 +1529,7 @@ int packed_object_info(struct repository *r, struct packed_git *p,
>  	 * We always get the representation type, but only convert it to
>  	 * a "real" type later if the caller is interested.
>  	 */
> -	if (oi->contentp) {
> +	if (oi->contentp && !oi->content_limit) {
>  		*oi->contentp = cache_or_unpack_entry(r, p, obj_offset, oi->sizep,
>  						      &type);
>  		if (!*oi->contentp)
> @@ -1555,6 +1555,17 @@ int packed_object_info(struct repository *r, struct packed_git *p,
>  				*oi->sizep = size;
>  			}
>  		}
> +
> +		if (oi->contentp) {
> +			if (oi->sizep && *oi->sizep < oi->content_limit) {

It happens that with the current code structure, at this point,
oi->content_limit is _always_ non-zero.  But it felt somewhat
fragile to rely on it, and I would have appreciated if this was
written with an explicit check for oi->content_limit, just like how
it is done in loose_object_info() function.

Other than that, looking very good.

Thanks.


> +				*oi->contentp = cache_or_unpack_entry(r, p, obj_offset,
> +								      oi->sizep, &type);
> +				if (!*oi->contentp)
> +					type = OBJ_BAD;
> +			} else {
> +				*oi->contentp = NULL;
> +			}
> +		}
>  	}
>  
>  	if (oi->disk_sizep) {




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux