Christian Couder <christian.couder@xxxxxxxxx> 于2021年8月18日周三 上午12:10写道: > > Hi, > > On Mon, Aug 16, 2021 at 4:00 PM ZheNing Hu <adlternative@xxxxxxxxx> wrote: > > > > Hi, > > > > In the implementation of %(raw) atom > > (bd0708c7 ref-filter: add %(raw) atom), we use xmemdupz() > > to copy the content of the object. But if we can reuse the content > > of the object? > > > > Since git cat-file --batch needs to use ref-filter > > as the backend, if the object buffer can be reused correctly here, > > we can save a lot of copies and improve its performance by about 6%. > > Yeah, that would be great. > > > Tracing back to the source, the object buffer is allocated from > > oid_object_info_extended(), but in parse_object_buffer() we may lose > > the ownership of the buffer (maybe the buffer is eaten), but I browsed the > > source code of for-each-ref.c or cat-file.c, and I don’t seem to find that the > > buffers which have been eaten are released by the program. > > > > So can we reuse it? > > > > This is what I want to do: > > > > diff --git a/ref-filter.c b/ref-filter.c > > index 93ce2a6ef2..1f6c1daabd 100644 > > --- a/ref-filter.c > > +++ b/ref-filter.c > > @@ -1443,7 +1443,7 @@ static void grab_sub_body_contents(struct > > atom_value *val, int deref, struct exp > > unsigned long buf_size = data->size; > > > > if (atom->u.raw_data.option == RAW_BARE) { > > - v->s = xmemdupz(buf, buf_size); > > + v->s = buf; > > It seems to me that this could work only if 'buf' isn't freed. Have > you checked that? Did we leak 'buf' before this patch? Otherwise when > are we freeing it? > This is how I use gdb find out if this buffer have been freed: Breakpoint 1, get_object (ref=0x5555559ad7a0, deref=1, obj=0x7fffffffc930, oi=0x5555559a21e0 <oi_deref>, err=0x7fffffffcb80) at ref-filter.c:1744 1744 { (gdb) n 1746 int eaten = 1; (gdb) 1747 if (oi->info.contentp) { (gdb) 1749 oi->info.sizep = &oi->size; (gdb) 1750 oi->info.typep = &oi->type; (gdb) 1752 if (oid_object_info_extended(the_repository, &oi->oid, &oi->info, (gdb) 1756 if (oi->info.disk_sizep && oi->disk_size < 0) (gdb) 1759 if (oi->info.contentp) { (gdb) p oi->info.contentp $1 = (void **) 0x5555559a2240 <oi_deref+96> (gdb) p oi->content $2 = (void *) 0x5555559afc90 (gdb) b free if $rdi == 0x5555559afc90 Breakpoint 2 at 0x7ffff7e1f980 (gdb) d 1 (gdb) c Continuing. tree 2025c6f1d02348585487ab5597d77151f158325f parent 532c266f73e11df407a8939135f2722c003b7539 author ZheNing Hu <adlternative@xxxxxxxxx> 1628909038 +0800 committer ZheNing Hu <adlternative@xxxxxxxxx> 1628909038 +0800 test2 tree 2025c6f1d02348585487ab5597d77151f158325f parent 532c266f73e11df407a8939135f2722c003b7539 author ZheNing Hu <adlternative@xxxxxxxxx> 1628909038 +0800 committer ZheNing Hu <adlternative@xxxxxxxxx> 1628909038 +0800 test2 [Inferior 1 (process 74283) exited normally] We didn't stop at the breakpoint, this means that we did not free() the memory. Or will this be used as a caching mechanism? I dunno... > > v->s_size = buf_size; > > } else if (atom->u.raw_data.option == RAW_LENGTH) { > > v->s = xstrfmt("%"PRIuMAX, (uintmax_t)buf_size); > > @@ -1768,8 +1768,6 @@ static int get_object(struct ref_array_item > > *ref, int deref, struct object **obj > > } > > > > grab_common_values(ref->value, deref, oi); > > - if (!eaten) > > - free(oi->content); > > This change might not be needed. 'oi->content' seems to come from the > 'buf' above, either directly or through a copy, but if we can indeed > own the 'buf' completely, then we should be able to dispose of it how > we want. > Since we will release the `v->s` (now is the buf) in free_array_item(), if we continue to call free() in get_object(), it may cause double-free. > > return 0; > > } Thanks. -- ZheNing Hu