Re: [External] Re: [PATCH v1] negotiator/default.c: avoid stack overflow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Apr 24, 2023 at 10:44 PM Derrick Stolee
<derrickstolee@xxxxxxxxxx> wrote:
>
> > This is the same case as [1].
> >
> > 1. https://lore.kernel.org/git/20221025232934.1504445-1-jonathantanmy@xxxxxxxxxx/
>
> Thanks for the link, though this could be replaced with
>
>   4654134976f (negotiator/skipping: avoid stack overflow, 2022-10-25)
>
> now that the change exists in the commit history.

make sense.

>
> One thing that is missing from that change is a test, and such a test
> could be generalized to apply to all negotiators. This could maybe
> help any potential future negotiator avoid this bug. Did you think
> about what such a test could look like? Perhaps test_commit_bulk
> could help, but we'd probably need to create so many commits that the
> test would need to be marked as expensive. That's probably a major
> reason to not include a test and rely on avoiding recursion when
> possible.

I first found this issue in a large repository with numerous merge commits.
To address it, I added a test case which fast-imports 10,000 commits and
runs them through run_with_limited_stack(). Although expensive, this
approach was successful in executing the test case without any issues.

>
> > -     if (commit != NULL && !(commit->object.flags & COMMON)) {
> > +     struct prio_queue queue = { NULL };
> > +
> > +     prio_queue_put(&queue, commit);
>
> Should we check the conditions what were removed? The COMMON flag
> is likely only useful for the recursion, but prio_queue_put() is
> not careful about NULL values. However, no callers should be
> providing NULL commits here.
>
> Couldn't hurt to add
>
>         if (!commit)
>                 return;

make sense.

>
> before the prio_queue_put().
>
> > +     while ((commit = prio_queue_get(&queue))) {
> >               struct object *o = (struct object *)commit;
> >
> > +             if (commit == NULL || (commit->object.flags & COMMON))
> > +                     continue;
>
> The NULL condition is definitely unnecessary here as it is checked
> by the while condition. The "& COMMON" is helpful if the commit
> gained the COMMON flag after being inserted into the queue.
>
> >               if (!ancestors_only)
> >                       o->flags |= COMMON;
> >
>
>
> > @@ -70,15 +76,17 @@ static void mark_common(struct negotiation_state *ns, struct commit *commit,
> >                               ns->non_common_revs--;
> >                       if (!o->parsed && !dont_parse)
> >                               if (repo_parse_commit(the_repository, commit))
> > -                                     return;
> > +                                     continue;
> >
> > +                     ancestors_only = 0;
>
> This caught me off guard, but this flag essentially says "should
> I mark the first commit as common or not?". It would probably be
> clearer if this was done before the loop, and then was ignored
> within the loop, setting the flag on each parent in this loop:
>
> >                       for (parents = commit->parents;
> >                                       parents;
> >                                       parents = parents->next)
> > -                             mark_common(ns, parents->item, 0,
> > -                                         dont_parse);
> > +                             prio_queue_put(&queue, parents->item);
>

I'll think about how to optimize this again.

ancestors_only is used multiple times in the original logic:
1.
              if (!ancestors_only)
                     o->flags |= COMMON;
2.
             if (!(o->flags & SEEN))
                     rev_list_push(ns, commit, SEEN);
             else {
                     struct commit_list *parents;

                     if (!ancestors_only && !(o->flags & POPPED))
                             ns->non_common_revs--;

Should we use this ?

             if (!ancestors_only) {
                    commit->object.flags |= COMMON;

                    if ((commit->object.flags & SEEN) &&
!(commit->object.flags & POPPED))
                             ns->non_common_revs--;
             }

and

                   for (parents = commit->parents;
                             parents;
                             parents = parents->next) {
                             if (parents->item->object.flags & COMMON)
                                      continue;

                            parents->item->object.flags |= COMMON;

                            if ((parents->item->object.flags & SEEN)
                                     && !(parents->item->object.flags & POPPED))
                                      ns->non_common_revs--;

                            prio_queue_put(&queue, parents->item);
                   }

> It would have an extra benefit: your walk may duplicate objects in the
> priority queue (there is no duplicate protection in prio_queue_put).
> But, we could use
>
>         if (!(parents->item->object.flags & COMMON)) {
>                 parents->item->object.flags |= COMMON;
>                 prio_queue_put(&queue, parents->item);
>         }
>
> as duplicate protection _and_ a clearer way to demonstrate what
> ancestors_only is doing. Without this protection, it is possible
> to have exponential growth in the priority queue using simple
> merge commits.
>
> You'd need this at the beginning:
>
>         if (!commit)
>                 return;
>
>         prio_queue_put(&queue, commit);
>         if (!ancestors_only)
>                 commit->object.flags |= COMMON;

Make sense.

> > diff --git a/negotiator/skipping.c b/negotiator/skipping.c
> > index c7d6ab39bc..3d262b3533 100644
> > --- a/negotiator/skipping.c
> > +++ b/negotiator/skipping.c
> > @@ -108,6 +108,8 @@ static void mark_common(struct data *data, struct commit *seen_commit)
> >                               prio_queue_put(&queue, p->item);
> >               }
> >       }
> > +
> > +     clear_prio_queue(&queue);
>
> This memory leak cleanup in the skipping negotiator is good to
> do, but should be split into its own change.
>
> In addition, the mark_common() method there seems to have a few
> problems:
>
>  1. It does not do duplicate protection before prio_queue_put().
>     (The COMMON bit would work here, too.)
>  2. When it translated from recursive to iterative it kept "return"
>     statements that should probably be "continue" statements.
>  3. It does not attempt to parse commits, and instead returns
>     immediately when finding an unparsed commit. This is something
>     that it did in its original version, so maybe it is by design,
>     but it doesn't match the doc comment for the method.
>
> Consider fixing these issues while you are here.
>

Make sense.

Thanks.
-Han Xin




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux