On Wed, Aug 18, 2021 at 1:11 PM ZheNing Hu <adlternative@xxxxxxxxx> wrote: > > ZheNing Hu <adlternative@xxxxxxxxx> 于2021年8月18日周三 下午5:07写道: > > I think the 'buf' is not freed in ref-filter logic. > > > > But one thing worth noting is: > > > > parse_object_buffer() may take this buffer away, and store it in > > tree->buffer or use set_commit_buffer() to store it in commit_slab. > > > > So in theory, as long as we don’t use parse_object_buffer(), this > > dynamically allocated memory can be used freely for us. If we use > > parse_object_buffer() and free the buffer, there seems to be a risk, > > but it does not appear so far. > > > > When parse_object_buffer() hints that we don’t free the buffer when > eaten == 1, I have a better idea, If the buffer is "eaten", we copy it, > otherwise we use the originally allocated space. > > -static void grab_sub_body_contents(struct atom_value *val, int deref, > struct expand_data *data) > +static void grab_sub_body_contents(struct atom_value *val, int deref, > struct expand_data *data, int eaten) > { > int i; > const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL; > @@ -1499,7 +1499,10 @@ static void grab_sub_body_contents(struct > atom_value *val, int deref, struct exp > unsigned long buf_size = data->size; > > if (atom->u.raw_data.option == RAW_BARE) { > - v->s = buf; > + if (eaten) > + v->s = xmemdupz(buf, buf_size); Can the 'buf' already be eaten at this point? I thought that it could be eaten only through v->s so after this code. > + else > + v->s = buf; > v->s_size = buf_size; > } else if (atom->u.raw_data.option == RAW_LENGTH) { > v->s = xstrfmt_len(&v->s_size, > "%"PRIuMAX, (uintmax_t)buf_size); > > As parse_object_buffer() does internally: the buffer of commit/tree objects > needs to be copied, but blob/tag not. You said that the number of commits > is generally significantly greater than the others. It seems that we cannot > make full use of this idea. But remember the "skip_parse_buffer" before? > If we skip the parse_object_buffer(), this buffer is also "!eaten". > > In other words, those default paths of git cat-file --batch are included in it! > So from the perspective of performance regression, this patch will be > beneficial. Yeah, it seems that we can benefit from something like this, but it'd be nice if things were clarified a bit more.