Eric Wong <e@xxxxxxxxx> writes: > From: Jeff King <peff@xxxxxxxx> > > Avoid unnecessary round trips to the object store to speed > up cat-file contents retrievals. The majority of packed objects > don't benefit from the streaming interface at all and we end up > having to load them in core anyways to satisfy our streaming > API. What I found missing from the description is something like ... The new trick used is to teach oid_object_info_extended() that a non-NULL oi->contentp that means "grab the contents of the objects here" can be told to refrain from grabbing an object that is too large. > diff --git a/object-file.c b/object-file.c > index 065103be3e..1cc29c3c58 100644 > --- a/object-file.c > +++ b/object-file.c > @@ -1492,6 +1492,12 @@ static int loose_object_info(struct repository *r, > > if (!oi->contentp) > break; > + if (oi->content_limit && *oi->sizep > oi->content_limit) { I cannot convince myself enough to say "content limit" is a great name. It invites "limited by what? text files are allowed but images are not?". > diff --git a/object-store-ll.h b/object-store-ll.h > index c5f2bb2fc2..b71a15f590 100644 > --- a/object-store-ll.h > +++ b/object-store-ll.h > @@ -289,6 +289,7 @@ struct object_info { > struct object_id *delta_base_oid; > struct strbuf *type_name; > void **contentp; > + size_t content_limit; > > /* Response */ > enum { > diff --git a/packfile.c b/packfile.c > index 4028763947..c12a0515b3 100644 > --- a/packfile.c > +++ b/packfile.c > @@ -1529,7 +1529,7 @@ int packed_object_info(struct repository *r, struct packed_git *p, > * We always get the representation type, but only convert it to > * a "real" type later if the caller is interested. > */ > - if (oi->contentp) { > + if (oi->contentp && !oi->content_limit) { > *oi->contentp = cache_or_unpack_entry(r, p, obj_offset, oi->sizep, > &type); > if (!*oi->contentp) > @@ -1555,6 +1555,17 @@ int packed_object_info(struct repository *r, struct packed_git *p, > *oi->sizep = size; > } > } > + > + if (oi->contentp) { > + if (oi->sizep && *oi->sizep < oi->content_limit) { It happens that with the current code structure, at this point, oi->content_limit is _always_ non-zero. But it felt somewhat fragile to rely on it, and I would have appreciated if this was written with an explicit check for oi->content_limit, just like how it is done in loose_object_info() function. Other than that, looking very good. Thanks. > + *oi->contentp = cache_or_unpack_entry(r, p, obj_offset, > + oi->sizep, &type); > + if (!*oi->contentp) > + type = OBJ_BAD; > + } else { > + *oi->contentp = NULL; > + } > + } > } > > if (oi->disk_sizep) {