Re: [PATCH 0/8] Introduce fwctl subsystem

Dan Williams <dan.j.williams@xxxxxxxxx> · Wed, 5 Jun 2024 21:56:14 -0700

Jason Gunthorpe wrote:
[..]
> > 3 years on from that recommendation it seems no vendor has even needed
> > that level of distribution help. I.e. checking a few distro kernels
> > (Fedora, openSUSE) shows no uptake for CONFIG_CXL_MEM_RAW_COMMANDS=y in
> > their debug builds. I can only assume that locally compiled custom
> > kernel binaries are filling the need.
> 
> My strong advice would be to be careful about this. Android-ism where
> nobody runs the upstream kernel is a real thing. For something
> emerging like CXL there is a real risk that the hyperscale folks will
> go off and do their own OOT stuff and in-tree CXL will be something
> usable but inferior. I've seen this happen enough times..

Hence my openness to considering fwctl...

> If people come and say we need X and the maintainer says no, they
> don't just give up and stop doing X, they go and do X anyhow out of
> tree. This has become especially true now that the center of business
> activity in server-Linux is driven by the hyperscale crowd that don't
> care much about upstream.

"...don't care much about upstream...". This could be a whole separate
thread unto itself.

> Linux maintainers don't actually have the power to force the industry
> to do things, though people do keep trying.. Maintainers can only
> lead, and productive leading is not done with a NO.
> 
> You will start to see this pain in maybe 5-10 years if CXL starts to
> be something deployed in an enterprise RedHat/Dell/etc sort of
> environment. Then that missing X becomes a critical issue because it
> turns out the hyperscale folks long since figured out it is really
> important but didn't do anything to enable it upstream.

This matches other feedback I have heard recently. Yes, distros hate
contending with every vendor's userspace toolkit, that was the original
distro feedback motivating CONFIG_CXL_MEM_RAW_COMMANDS to have a poison
pill of WARN() on use. However, allowing more vendor commands is
preferable to contending with vendor out-of-tree drivers that likely
help keep the enterprise-distro-kernel stable-ABI train rolling. In
other words, legalize it in order to centrally regulate it.

[..]
> This is my effort here. If we document the expectations there is a
> much better chance that a standard body or device manufacturer can
> implement their interfaces in a way that works with the OS. There is a
> much higher chance they will attract CVEs and be forced to fix it if
> the security expectations are clearly laid out. You had a good
> observation in one of those links about how they are not OS
> people. Let's help them do better.
> 
> Shunt the less robust stuff to fwctl and then people can also make
> their own security choices, don't enable or load the fwctl modules and
> you get more protection. It is closer to your
> CONFIG_CXL_MEM_RAW_COMMANDS=y but at runtime.
> 
> I think I captured most of your commentary below here in patch 6.

I will take a look...

> >   Effects Log". In that "trust Command Effects" scenario the kernel still
> >   has no idea what the command is actually doing, but it can at least
> >   assert that the device does not claim that the command changes the
> >   contents of system-memory. Now, you might say, "the device can just
> >   lie", but that betrays a conceit of the kernel restriction. A device
> >   could lie that a Linux wrapped command when passed certain payloads does
> >   not in turn proxy to a restricted command.
> 
> Yeah, we have to trust the device. If the device is hostile toward the
> OS then there are already big problems. We need to allow for
> unintentional defects in the devices, but we don't need to be
> paranoid.
> 
> IMHO a command effects report, in conjunction with a robust OS centric
> definition is something we can trust in.

So this is where I want to start and see if we can bridge the trust gap.

I am warming to your assertion that there is a wide array of
vendor-specific configuration and debug functionality that is not an
efficient use of upstream's time to wrap in a shared Linux ABI. I want
to explore fwctl for CXL for that use case; I personally don't want to
marshal a Linux command to each vendor's slightly different backend CXL
toggles.

At the same time, I also agree with the contention that a "do anything
you want and get away with it" tunnel invites shenanigans from folks
that may not care about the long term health of the Linux kernel vs
their short term interests. It is difficult to unring the bell once a
tunnel is in place. While subsystems will rightly take different
stances on fwctl policy, the lack of a one-size-fits-all policy does not
seem sufficient reason to keep the concept out of the kernel entirely.

I appreciate that you crafted this interface with an eye towards making
it unsuitable for data-path operations.

So my questions to try to understand the specific sticking points more
are:

1/ Can you think of a Command Effect that the device could enumerate to
address the specific shenanigans that netdev is worried about? In other
words, if every command a device enables has the stated effect of
"Configuration Change after Reset", does that cut out a significant
portion of the concern? Make this a debate on finer-grained effects, not
a coarse-grained binary decision on whether fwctl should move forward at
all.

2/ About the "what if the device lies?" question. We can't revert code
that used to work, but we can definitely work with enterprise distros to
turn off fwctl where there is concern it may lead or is leading to
shenanigans. So, document what each subsystem's stance towards fwctl is,
like maybe a distro only wants fwctl to front publicly documented vendor
commands, or maybe private vendor commands are ok, but only with a
constrained set of Command Effects (I potentially see CXL here). A
distro should know what they are opting into for each fwctl instance;
it will likely always need to be subsystem-specific policy. A distro can
also decide lockdown policy based on Command Effects above and beyond
the ones that clearly state they allow the device to modify the running
kernel.
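
To make the Command Effects idea in 1/ and 2/ more concrete, here is a
minimal sketch of what a per-subsystem (or per-distro) effects policy
check could look like. Everything in it is hypothetical: the flag names
and bit positions are only loosely modeled on the CXL Command Effects
Log, and struct fwctl_fx_policy and fx_policy_allows() are illustrative
names, not an existing or proposed kernel interface.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical effect flags, loosely modeled on the CXL Command Effects
 * Log. Names and bit positions are illustrative assumptions, not copied
 * from the spec or from any kernel header.
 */
#define FX_CONFIG_CHANGE_AFTER_RESET  (1u << 0)
#define FX_IMMEDIATE_CONFIG_CHANGE    (1u << 1)
#define FX_IMMEDIATE_DATA_CHANGE      (1u << 2)
#define FX_IMMEDIATE_POLICY_CHANGE    (1u << 3)
#define FX_IMMEDIATE_LOG_CHANGE       (1u << 4)
#define FX_SECURITY_STATE_CHANGE      (1u << 5)

/* Hypothetical policy that a subsystem or distro would choose. */
struct fwctl_fx_policy {
	uint16_t allowed_effects;  /* effects a command may claim */
	bool allow_undocumented;   /* permit private vendor commands? */
};

/*
 * Allow a command only if every effect the device reports for it is in
 * the allowed set. This trusts the device's own report; it cannot
 * detect a device that lies about what a command does.
 */
static bool fx_policy_allows(const struct fwctl_fx_policy *p,
			     uint16_t reported_effects, bool documented)
{
	if (!documented && !p->allow_undocumented)
		return false;
	return (reported_effects & (uint16_t)~p->allowed_effects) == 0;
}
```

With allowed_effects set to only FX_CONFIG_CHANGE_AFTER_RESET, a command
that also reports FX_IMMEDIATE_DATA_CHANGE would be rejected, while a
"documented commands only" distro would clear allow_undocumented and
could widen the mask.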