Hi, On Mon, Aug 16, 2021 at 4:00 PM ZheNing Hu <adlternative@xxxxxxxxx> wrote: > > Hi, > > In the implementation of %(raw) atom > (bd0708c7 ref-filter: add %(raw) atom), we use xmemdupz() > to copy the content of the object. But if we can reuse the content > of the object? > > Since git cat-file --batch needs to use ref-filter > as the backend, if the object buffer can be reused correctly here, > we can save a lot of copies and improve its performance by about 6%. Yeah, that would be great. > Tracing back to the source, the object buffer is allocated from > oid_object_info_extended(), but in parse_object_buffer() we may lose > the ownership of the buffer (maybe the buffer is eaten), but I browsed the > source code of for-each-ref.c or cat-file.c, and I don’t seem to find that the > buffers which have been eaten are released by the program. > > So can we reuse it? > > This is what I want to do: > > diff --git a/ref-filter.c b/ref-filter.c > index 93ce2a6ef2..1f6c1daabd 100644 > --- a/ref-filter.c > +++ b/ref-filter.c > @@ -1443,7 +1443,7 @@ static void grab_sub_body_contents(struct > atom_value *val, int deref, struct exp > unsigned long buf_size = data->size; > > if (atom->u.raw_data.option == RAW_BARE) { > - v->s = xmemdupz(buf, buf_size); > + v->s = buf; It seems to me that this could work only if 'buf' isn't freed. Have you checked that? Did we leak 'buf' before this patch? Otherwise when are we freeing it? > v->s_size = buf_size; > } else if (atom->u.raw_data.option == RAW_LENGTH) { > v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size); > @@ -1768,8 +1768,6 @@ static int get_object(struct ref_array_item > *ref, int deref, struct object **obj > } > > grab_common_values(ref->value, deref, oi); > - if (!eaten) > - free(oi->content); This change might not be needed. 'oi->content' seems to come from the 'buf' above, either directly or through a copy, but if we can indeed own the 'buf' completely, then we should be able to dispose of it how we want. > return 0; > }