Re: [GSOC] [QUESTION] ref-filter: can %(raw) implement reuse oi.content?

ZheNing Hu <adlternative@xxxxxxxxx> · Wed, 18 Aug 2021 19:11:22 +0800




ZheNing Hu <adlternative@xxxxxxxxx> 于2021年8月18日周三 下午5:07写道：
> I think the 'buf' is not freed in ref-filter logic.
>
> But one thing worth noting is:
>
> parse_object_buffer() may take this buffer away, and store it in
> tree->buffer or use set_commit_buffer() to store it in commit_slab.
>
> So in theory, as long as we don’t use parse_object_buffer(), this
> dynamically allocated memory can be used freely for us. If we use
> parse_object_buffer() and free the buffer, there seems to be a risk,
> but it does not appear so far.
>

When parse_object_buffer() hints that we don’t free the buffer when
eaten == 1, I have a better idea, If the buffer is "eaten", we copy it,
otherwise we use the originally allocated space.

-static void grab_sub_body_contents(struct atom_value *val, int deref,
struct expand_data *data)
+static void grab_sub_body_contents(struct atom_value *val, int deref,
struct expand_data *data, int eaten)
 {
        int i;
        const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
@@ -1499,7 +1499,10 @@ static void grab_sub_body_contents(struct
atom_value *val, int deref, struct exp
                        unsigned long buf_size = data->size;

                        if (atom->u.raw_data.option == RAW_BARE) {
-                               v->s = buf;
+                               if (eaten)
+                                       v->s = xmemdupz(buf, buf_size);
+                               else
+                                       v->s = buf;
                                v->s_size = buf_size;
                        } else if (atom->u.raw_data.option == RAW_LENGTH) {
                                v->s = xstrfmt_len(&v->s_size,
"%"PRIuMAX, (uintmax_t)buf_size);

As parse_object_buffer() does internally: the buffer of commit/tree objects
needs to be copied, but blob/tag not. You said that the number of commits
is generally significantly greater than the others. It seems that we cannot
make full use of this idea. But remember the "skip_parse_buffer" before?
If we skip the parse_object_buffer(), this buffer is also "!eaten".

In other words, those default paths of git cat-file --batch are included in it!
So from the perspective of performance regression, this patch will be
beneficial.

Thanks.
--
ZheNing Hu