Re: [GSOC] [QUESTION] ref-filter: can %(raw) implement reuse oi.content?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Christian Couder <christian.couder@xxxxxxxxx> 于2021年8月20日周五 下午11:58写道:
>
> On Wed, Aug 18, 2021 at 1:11 PM ZheNing Hu <adlternative@xxxxxxxxx> wrote:
> >
> > ZheNing Hu <adlternative@xxxxxxxxx> 于2021年8月18日周三 下午5:07写道:
> > > I think the 'buf' is not freed in ref-filter logic.
> > >
> > > But one thing worth noting is:
> > >
> > > parse_object_buffer() may take this buffer away, and store it in
> > > tree->buffer or use set_commit_buffer() to store it in commit_slab.
> > >
> > > So in theory, as long as we don’t use parse_object_buffer(), this
> > > dynamically allocated memory can be used freely for us. If we use
> > > parse_object_buffer() and free the buffer, there seems to be a risk,
> > > but it does not appear so far.
> > >
> >
> > When parse_object_buffer() hints that we don’t free the buffer when
> > eaten == 1, I have a better idea, If the buffer is "eaten", we copy it,
> > otherwise we use the originally allocated space.
> >
> > -static void grab_sub_body_contents(struct atom_value *val, int deref,
> > struct expand_data *data)
> > +static void grab_sub_body_contents(struct atom_value *val, int deref,
> > struct expand_data *data, int eaten)
> >  {
> >         int i;
> >         const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
> > @@ -1499,7 +1499,10 @@ static void grab_sub_body_contents(struct
> > atom_value *val, int deref, struct exp
> >                         unsigned long buf_size = data->size;
> >
> >                         if (atom->u.raw_data.option == RAW_BARE) {
> > -                               v->s = buf;
> > +                               if (eaten)
> > +                                       v->s = xmemdupz(buf, buf_size);
>
> Can the 'buf' already be eaten at this point? I thought that it could
> be eaten only through v->s so after this code.
>

I think where the 'buf' is eaten is parse_object_buffer(). set_commit_buffer()
or parse_tree_buffer() cache the 'buf'. 'eaten' means we don't need to free it
manually. In grab_sub_body_contents() we can indeed write the address of
'buf' to v->s, but what I worry about is that before we completely write v->s to
the output buffer through append_atom(), will this buf be freed in
somewhere else?

> > +                               else
> > +                                       v->s = buf;
> >                                 v->s_size = buf_size;
> >                         } else if (atom->u.raw_data.option == RAW_LENGTH) {
> >                                 v->s = xstrfmt_len(&v->s_size,
> > "%"PRIuMAX, (uintmax_t)buf_size);
> >
> > As parse_object_buffer() does internally: the buffer of commit/tree objects
> > needs to be copied, but blob/tag not. You said that the number of commits
> > is generally significantly greater than the others. It seems that we cannot
> > make full use of this idea. But remember the "skip_parse_buffer" before?
> > If we skip the parse_object_buffer(), this buffer is also "!eaten".
> >
> > In other words, those default paths of git cat-file --batch are included in it!
> > So from the perspective of performance regression, this patch will be
> > beneficial.
>
> Yeah, it seems that we can benefit from something like this, but it'd

Only when the data allocated to us is dynamically allocated and we do have
the ownership of it, we can benefit by reducing copies and allocate
memory again.

> be nice if things were clarified a bit more.

OK. In get_object(), eaten means that the oi->content we obtained is cached
by git in parse_object_buffer(). This means that we don't need to free this buf
manually. In 28511adfa5 ([GSOC] ref-filter: skip parse_object_buffer()), we skip
parse_buffer() when using some atoms, which can avoid some meaningless parsing.
So under this path, our'buf' is not cached. This means we have
ownership of it, so we
can free it freely, and benefit from this.

Thanks.
--
ZheNing Hu




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux