Re: RFC: Github PR bot questions

On Thu, Jun 17, 2021 at 10:55 AM Mauro Carvalho Chehab
<mchehab+huawei@xxxxxxxxxx> wrote:
>
> On Thu, 17 Jun 2021 10:20:31 +0200
> Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>
> > On Thu, Jun 17, 2021 at 8:53 AM Mauro Carvalho Chehab
> > <mchehab+huawei@xxxxxxxxxx> wrote:
> > >
> > > On Wed, 16 Jun 2021 15:11:33 -0600
> > > Rob Herring <robh@xxxxxxxxxx> wrote:
> > >
> > > > On Wed, Jun 16, 2021 at 11:18 AM Konstantin Ryabitsev
> > > > <konstantin@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > Hi, all:
> > > > >
> > > > > I've been doing some work on the "github-pr-to-ml" bot that can monitor GitHub
> > > > > pull requests on a project and convert them into fully well-formed patch
> > > > > series. This would be a one-way operation, effectively turning Github into a
> > > > > fancy "git-send-email" replacement. That said, it would have the following
> > > > > benefits for both submitters and maintainers:
> > > >
> > > > What makes this specific to Github PRs? A Github PR is really just a
> > > > git branch plus a target at least to the extent we would use it here.
> > > > The more of this that works on just a git branch, the more widely
> > > > useful it would be.
> > > >
> > > > > - submitters would no longer need to navigate their way around
> > > > >   git-format-patch, get_maintainer.pl, and git-send-email -- nor would need to
> > > > >   have a patch-friendly outgoing mail gateway to properly contribute patches
> > > >
> > > > Presumably, the bot would rely on get_maintainer.pl or it would get
> > > > who to send to based on GH repo and reviewers? Without work on
> > > > get_maintainer.pl, I don't think it will work well beyond simple
> > > > cases.
> > >
> > > Some sanity check is needed, as otherwise it will end up trying to
> > > send the patch to a large number of people.
> >
> > I think this system needs to use get_maintainer.pl results as is and
> > any fixing/filtering/sanity checking needs to go into
> > get_maintainer.pl itself.
> > get_maintainer.pl is what lots of contributors use, it is the only
> > option for any automated system, and it is what new contributors use
> > if they don't use this system anyway. And even experienced developers
> > know the internal rules of only a few subsystems and use
> > get_maintainer.pl when sending a one-off patch to another subsystem
> > (what else?).
> >
> > I don't see where that gets us if we accept that get_maintainer.pl
> > produces bad results and needs additional fixing in every system out
> > there (dozens) and when used by humans. All systems would need the
> > same filtering/checking rules and would need to keep them in sync. What
> > would a kernel developer even need to do to fix something (add/remove
> > themselves)? Go and talk to a large unknown set of systems that
> > duplicate the same additional rules?
> >
> > And the only way to surface actual issues with get_maintainer.pl is to
> > start using it. In fact it's already widely used as is, so I am not
> > sure it's particularly bad.
>
> I'm not saying that get_maintainer.pl produces bad results. Depending
> on what is done, it could produce a very large output.
>
> Let's suppose that someone does something like globally renaming a
> widely-used kAPI, e.g. something like:
>
>         $ git ls-files|xargs sed s,mutex_,new_mutex_, -i
>
> A change like that would touch lots of subsystems, making
> get_maintainer.pl spend a lot of time processing it and produce
> thousands of entries (btw, we had a change somewhat similar to the above
> a long time ago, when the mutex API was introduced and most of the
> semaphores were converted to use the mutex kAPI instead).
>
> People who are used to working with GitHub/GitLab won't think twice
> about making a change like that in a single patch, but this is something
> that won't work on a PR -> email interface, nor would maintainers like
> to review such a big patch touching multiple subsystems.
>
> So, a bot would need to not only check the size of the patches,
> but also the output of checkpatch.pl.
>
> It will also need to have a timeout to abort checkpatch.pl, not only
> to keep the bot from being out of service for a long time, but also
> because checkpatch.pl taking more than a minute or so is a good
> indication that the patch is not good, as it is touching too many
> things.

Hi Mauro,

Interesting point.

But maybe it can still be incorporated into get_maintainer.pl?... at
least some part of it?
It may be useful for a new contributor who invokes it manually to get
the same "your patch is probably too big, consider splitting it"
message. And any other automated system probably wants the same
threshold for "running too long"/"returning too many emails" w/o
duplicating the values on its side.
Not sure what the best interface for this is... Either
get_maintainer.pl can accept an additional flag that will make it fail
for the "no, don't mail this patch automatically" case, or it can
always produce an additional recognizable message at the end for
"that's too many emails, consider splitting the patch".
What do you think?
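
For illustration, here is a rough sketch of what every caller would have
to carry on its own side if get_maintainer.pl does not grow such a check.
It is only a sketch under assumptions: the function names and both
threshold values are made up for this example (nobody in this thread has
proposed concrete numbers), and it assumes the script prints one
recipient per line.

```python
import subprocess

# Hypothetical thresholds, chosen only for illustration.
MAX_RECIPIENTS = 50
TIMEOUT_SECONDS = 60


def classify_recipients(output, max_recipients=MAX_RECIPIENTS):
    """Split one-recipient-per-line output and check it against the limit."""
    recipients = [line.strip() for line in output.splitlines() if line.strip()]
    return len(recipients) <= max_recipients, recipients


def run_with_limits(cmd, timeout=TIMEOUT_SECONDS, max_recipients=MAX_RECIPIENTS):
    """Run cmd with a timeout; return (ok, recipients, reason)."""
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return False, [], "running too long"
    except subprocess.CalledProcessError:
        return False, [], "command failed"

    ok, recipients = classify_recipients(result.stdout, max_recipients)
    reason = "" if ok else "too many emails, consider splitting the patch"
    return ok, recipients, reason
```

A bot would call it as something like
run_with_limits(["./scripts/get_maintainer.pl", "0001-example.patch"]),
where the patch file name is a placeholder. Every automated system would
need its own copy of the two constants, which is exactly the duplication
I would like to avoid by moving the check into the script.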