Re: [PATCH v2 0/2] cipso: make cipso_v4_skbuff_delattr() fully remove the CIPSO options

Ondrej Mosnacek <omosnace@xxxxxxxxxx> · Fri, 26 Jul 2024 14:44:09 +0200

On Thu, Jun 20, 2024 at 4:39 PM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
>
> On Thu, Jun 20, 2024 at 6:03 AM Ondrej Mosnacek <omosnace@xxxxxxxxxx> wrote:
> > On Wed, Jun 19, 2024 at 4:46 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > > On June 14, 2024 11:08:41 AM Paul Moore <paul@xxxxxxxxxxxxxx> wrote:
> > > > On Fri, Jun 14, 2024 at 3:20 AM <patchwork-bot+netdevbpf@xxxxxxxxxx> wrote:
> > > >>
> > > >> Hello:
> > > >>
> > > >> This series was applied to netdev/net.git (main)
> > > >> by David S. Miller <davem@xxxxxxxxxxxxx>:
> > > >
> > > > Welp, that was premature based on the testing requests in the other
> > > > thread, but what's done is done.
> > > >
> > > > Ondrej, please accelerate the testing if possible as this patchset now
> > > > in the netdev tree and it would be good to know if it need a fix or
> > > > reverting before the next merge window.
> > >
> > > Ondrej, can you confirm that you are currently working on testing this
> > > patchset as requested?
>
> [NOTE: adding SELinux list as a FYI for potential breakage in upcoming kernels]
>
> > Not really... I tried some more to get cloud-init to work on FreeBSD,
> > but still no luck...
>
> As mentioned previously, if you aren't able to fit the testing into
> your automated framework, you'll need to do some manual testing to
> verify the patches.

Sigh... okay, I now did test the scenario with a FreeBSD system as B
and it passed.

> > Anyway, I still don't see why I should waste my
> > time on testing this scenario, since you didn't provide any credible
> > hypothesis on why/what should break there.
>
> I did share my concern about changes in packet length across the
> network path and an uncertainty about how different clients might
> react.  While you tested with Linux based systems, I requested that
> you test with at least one non-Linux client to help verify that things
> are handled properly.
>
> Perhaps you don't view that concern as credible, but it is something
> I'm worried about as a common use case is for non-Linux clients to
> connect over an unlabeled, single label/level network to a Linux
> gateway which then routes traffic over different networks, some with
> explicit labeling.
>
> If you don't believe that testing this is important Ondrej, trust
> those who have worked with a number of users who have deployed these
> types of systems that this is important.

I'm not saying the concern is not credible or that (in general)
testing this use case is not important. What I'm missing is some
explanation/reasoning that would make me think "Oh yeah, these patches
really could break this scenario". You said something about consistent
IP header overhead and bidirectional stream-based connections, but I
don't understand how the former could cause an issue with the latter.
Does it violate some specification? And I'm not arguing that there
isn't a possible bug because I don't see it; I'm just arguing that if
there is a mechanism through which the change could cause a bug in
this scenario, you should be able to explain it (at least roughly) to
someone who doesn't see it there.

>
> > Convince me that there is a
> > valid concern and I will be much more willing to put more effort into
> > it.
>
> I've shared my concerns with you, both in previous threads and now in
> this thread.  This really shouldn't be about convincing you to do The
> Right Thing and verify that your patch doesn't break existing users,
> it should be about you wanting to do The Right Thing so your work
> doesn't break the kernel.
>
> > You see something there that I don't, and I'd like to see and
> > understand it, too. Let's turn it from *your* concern to *our* concern
> > (or lack of it) and then the cooperation will work better.
>
> It's not about you or I, it's about all of the users who rely on this
> functionality and not wanting to break things for them.
>
> Test your patches Ondrej, if you don't you'll find me increasingly
> reluctant to accept anything from you in any of the trees I look
> after.

Paul, I don't want to break the kernel, but that doesn't mean I will
do an excessive amount of work for someone else when there doesn't
seem to be a logical reason to do so. IMHO, just because someone
somewhere has a special hard-to-test use case that is very important
to them doesn't mean that it is your job as a community project
maintainer to force other contributors to do work to defend these
peoples' use cases. The fact that we don't see any effort from them to
have their use case tested upstream (they could either auto-run their
tests on patches themselves and mail back the results or provide the
test code and/or infrastructure and ask maintainers + contributors to
run the tests on patches before they are merged) means that one of the
following is true:
1. They do care about upstream not breaking their use case, but expect
that upstream will automatically ensure that "for free".
2. They accept the fact that upstream may break their use case and
they rely on testing at a later stage to find issues and
reporting/fixing the bugs once they are found. Usually this is because
they have calculated / assume that at the given time, the cost of
implementing/facilitating testing on upstream level is higher than the
cost of finding the bugs later and waiting for the upstream kernel to
become usable again.
In both cases it doesn't make sense for the community to self-impose
the need to substitute the lack of effort from the consumer side, and
much less so to push that effort to others (which will just drive
contributors away, which is a far worse consequence than possibly
breaking some complex scenario once in a while). However, as soon as
someone comes and says "Hey, we have this test, this is how you can
run it. Can you make sure patches touching X are tested with it? If
it's too much of a hassle, let us help to make it work.", it's an
entirely different situation and we (both contributors and
maintainers) should very much do our best to fulfill the request.

This is my idea of how the fine balance in the
user-maintainer-contributor relationship could work best. I'm writing
it here so that you can understand where I'm coming from and perhaps
ponder about it a bit. But the main point here is that you requested a
complicated scenario to be tested without adequately explaining what
you assume to be possibly broken. And I also pointed out that the
behavior you seem to think can cause breakage is already present in
the current code - you didn't answer that. So I feel like you
arbitrarily push some high-effort requirement on me and that's not
nice.

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.