Re: [PATCH v4 tip/perf/core 0/4] uprobes,mm: speculative lockless VMA-to-uprobe lookup

Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> · Thu, 21 Nov 2024 06:43:28 -0800

cc'ing Liao for awareness (should have done it earlier, sorry)

On Thu, Nov 21, 2024 at 1:33 AM Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
>
> * Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> > Is this about Liao's siglock patch set? We are at v4 (!) already (see
> > [0]) with Oleg's Acked-by added.
>
> AFAICS you didn't Cc: me to -v3 and -v4 - and while I'll generally see
> those patches too, eventually, there's a delay.

Ok, I think we are now switching to my patch set here ([0]), because
Liao's v4 ([1]) does have mingo@xxxxxxxxxx in CC. So, on Liao's
behalf, there wasn't really anything specific pointed out that would
explain a month's delay.

But let's switch to my patch set. Yes, my bad, I didn't CC you
directly, that wasn't in any way intentional, and that's my bad and I
will make sure to CC you on every patch for uprobes subsystem, even
though you are not explicitly listed in UPROBES section of MAINTAINERS
([2]), and Peter was the one who was handling all the uprobe-related
stuff since before this summer.

But let's please not randomly jump between discussing two separate
patch sets here, it's confusing.

>
> > > Andrii did get some other uprobes scalability work merged in v6.13:
> > >
> > >     - Switch to RCU Tasks Trace flavor for better performance
> > >     (Andrii Nakryiko)
> > >
> > >     - Massively increase uretprobe SMP scalability by
> > >       SRCU-protecting the uretprobe lifetime (Andrii Nakryiko)
> > >
> > > So we've certainly not been ignoring his patches, to the contrary
> > > ...
> >
> > Yes, and as I mentioned, this one is a) ready, reviewed, tested and
> > b) complements the other work you mention.
>
> Sorry, but patchsets that didn't even build a few weeks before the
> development window closed are generally pushed further down the
> backlog. Think of this as rate-limiting the risk of potentially broken
> code entering the kernel. You can avoid this problem by doing more
> testing, or by accepting that sometimes one more cycle is needed to get
> your patchsets merged.

Those build failures were happening in stub implementations, which
were used only on !CONFIG_PER_VMA_LOCK configuration, which I'm not
even sure if possible to get on x86-64. When I tried to reproduce this
locally I couldn't even get such configuration. Thankfully we have a
kernel test robot that would test multiple architectures and
configurations and it did catch it on i386 and loongarch64. And was
fixed quickly.

It doesn't seem fair or reasonable to penalize patch sets for *weeks*,
silently and without any notice just for this, IMO.

The patch set was very thoroughly tested, actually. Not just building,
but also running various unit tests (BPF selftests in particular). But
even more so, I built an entire uprobe stress-testing tool just to
test all my uprobe-related. I deployed custom kernels and ran these
stress tests on all uprobe patch sets and their revisions, over many
hours.

Sure, my main platform is x86-64, so that's where all the testing was
done. But you can't accuse me of negligence.

>
> > [...] It removes mmap_lock which limits scalability of the rest of
> > the work. Is there some rule that I get to land only two patch sets
> > in a single release?
>
> Your facetous question and the hostile tone of your emails is not
> appreciated.

I'm sticking to the facts in these emails. And when I get a response
in the style of "you got two patch sets in, why are you complaining",
that's not exactly friendly and fair. I put a lot of effort and time
not just into producing and testing all those patches, but also into
the logistics of it, coordinating with other people working within
uprobes subsystem.

And instead of accusations, I'd like to get an understanding of
expectations I can have in terms of handling patches. Being ignored
for many weeks is not OK. If you don't like something about what I do
or how, procedurally or technically, please call it out and I'll try
to fix whatever the problem might be. Silent treatment is not
productive.

But while on the topic. Those two patch sets you mentioned didn't go
in smoothly and quickly either. "Switch to RCU Tasks Trace flavor for
better performance" in particular should have gone in with the
original patch set one release earlier. But instead that patch was
dropped from the tree after applying it. Silently. I was not notified
at all (5 days that passed before I asked would be plenty of time to
mention this, IMO). It's good I noticed this, inquired with an email
reply (after making sure it's not some transient patch handling
issue), and only after that I got a reply that there was a build
failure I had to fix. You can see the thread at [4].

>
> Me pointing out that two other patchsets of yours got integrated simply
> demonstrates how your original complaint of an 'ignore list' is not
> just unprofessional on its face, but also demonstrably unfair:
>
> > > > > I'm not sure what's going on here, this patch set seems to be
> > > > > in some sort of "ignore list" on Peter's side with no
> > > > > indication on its destiny.

My "ignore list" complaint is specifically about this patch set, which
I explicitly stated above in the quote you provided. So yes, it's a
professional, and demonstrably fair statement, and I provided the
timeline and supporting links.

>
> Trying to pressure maintainers over a patchset that recently had build
> failures isn't going to get your patches applied faster.
>

I'm not asking to apply my patches blindly without critical review or
anything like that. I'm not expecting reviews or even just email
replies within a few days of posting. I *do* expect some sort of
reaction, though, and not after many weeks of inactivity and pinging
from my side, yes. And note, I got replies only after sending an email
straight to Linus.

I'm not pressuring anyone into anything. But as a maintainer myself, I
do think that being a maintainer is not just about having rights, it's
also about responsibilities.

Let's please stop with the excuses and instead discuss constructive solutions.

> Thanks,
>
>         Ingo

[0] https://lore.kernel.org/linux-trace-kernel/20241028010818.2487581-1-andrii@xxxxxxxxxx/
[1] https://lore.kernel.org/linux-trace-kernel/20241022073141.3291245-1-liaochang1@xxxxxxxxxx/
[2] https://lore.kernel.org/all/172074397710.247544.17045299807723238107.stgit@devnote2/
[3] https://github.com/libbpf/libbpf-bootstrap/commit/2f88cef90f9728ec8c7bee7bd48fdbcf197806c3
[4] https://lore.kernel.org/all/CAEf4BzZihPPiReE3anhrVOzjoZW5v4vFVouK_Arm8vJexCTT4g@xxxxxxxxxxxxxx/