RE: RFC: Restricting userspace interfaces for CXL fabric management

Perhaps a bit more color on a few specifics might be helpful.

I think there will always be a class of vendor-specific APIs/opcodes
that relate to an implementation of a standard rather than to the
standard itself.  I've been party to discussions about not creating
CXL-defined APIs/opcodes that get into the realm of specifying an
implementation.  There is also a class of data that can be collected from
a specific implementation that is helpful for debug, health monitoring,
and perhaps performance monitoring, where the implementation matters and
the data is therefore not easily abstracted into a standard.

A few examples:
a) Temperature monitoring of a component or internal chip die
temperatures.  Could CXL define a standard opcode to gather temperatures?
Yes, it could, but is this really part of CXL?  And then how many
temperature elements are there, and what does each element mean?  This
enters into the implementation and is therefore vendor specific, unless
the CXL spec starts to define the implementation with something along the
lines of "thou shall have an average die temperature, rather than specific
temperatures across a die", etc.

b) Error counters, metrics, internal counters, etc.  Could CXL define a
set of common error counters?  Absolutely.  PCIe has done some of this.
However, a specific implementation may have counters and error reporting
that are meaningful only to a specific design and a specific
implementation, and that do not fit the "least common denominator"
approach of a standards body.

c) Performance counters, metrics, indicators, etc.  Performance can be very
implementation specific, and tweaking performance is likely to be
implementation specific as well.  Yes, generic, least-common-denominator
elements could be created, but they are likely to be limiting in realizing
the maximum performance of an implementation.

d) Logs, errors and debug information.  In addition to spec-defined
logging of CXL topology errors, specific designs will have logs, crash
dumps, and debug data that are very specific to an implementation.  There
are also likely to be cases where a product that conforms to a
specification like CXL has features that have nothing directly to do with
CXL, but where a standards-based management interface can be used to
configure, manage, and collect data for those non-CXL features.

e) Innovation.  I believe that innovation should be encouraged.  There may
be designs that support CXL, but that also incorporate unique and
innovative features or functions that might service a niche market.  The
AI space is ripe for innovation and perhaps specialized features that may
not make sense for the overall CXL specification.

I think that in most cases vendor-specific opcodes are not used to
circumvent the standards, but are used when the standards group has no
interest in driving certain features into the standard: features that are
clearly either implementation specific or vendor-specific additions that
appeal to a select class of customer but are not relevant to the
standard itself.

At the end of the day, customers want products that solve a specific
problem.  Sometimes a vendor can address market segments or niches that a
standards group has no interest in supporting.  It can also take months,
and in some cases years, to reach agreement on what a standardized feature
should look like.  I also believe that there can be competitive reasons
why a group might want to slow down a vendor's implementation for fear of
losing market share.

Thanks
Harold Johnson


-----Original Message-----
From: Jonathan Cameron [mailto:Jonathan.Cameron@xxxxxxxxxx]
Sent: Friday, April 26, 2024 11:54 AM
To: Dan Williams
Cc: linux-cxl@xxxxxxxxxxxxxxx; Sreenivas Bagalkote; Brett Henning; Harold
Johnson; Sumanesh Samanta; linux-kernel@xxxxxxxxxxxxxxx; Davidlohr Bueso;
Dave Jiang; Alison Schofield; Vishal Verma; Ira Weiny;
linuxarm@xxxxxxxxxx; linux-api@xxxxxxxxxxxxxxx; Lorenzo Pieralisi; Natu,
Mahesh; gregkh@xxxxxxxxxxxxxxxxxxx
Subject: Re: RFC: Restricting userspace interfaces for CXL fabric
management

On Fri, 26 Apr 2024 09:16:44 -0700
Dan Williams <dan.j.williams@xxxxxxxxx> wrote:

> Jonathan Cameron wrote:
> [..]
> > To give people an incentive to play the standards game we have to
> > provide an alternative.  Userspace libraries will provide some
> > incentive to standardize if we have enough vendors (we don't today -
> > so they will do their own libraries), but it is a lot easier to
> > encourage if we exercise control over the interface.
>
> Yes, and I expect you and I are not far off on what can be done
> here.
>
> However, let's cut to a sentiment hanging over this discussion. Referring
> to vendor specific commands:
>
>     "CXL spec has them for a reason and they need to be supported."
>
> ...that is an aggressive "vendor specific first" sentiment that
> generates an aggressive "userspace drivers" reaction, because the best
> way to get around community discussions about what ABI makes sense is
> userspace drivers.
>
> Now, if we can step back to where this discussion started, where typical
> Linux collaboration shines, and where I think you and I are more aligned
> than this thread would indicate, is "vendor specific last". Let's
> carefully consider the vendor specific commands that are candidates to
> be de facto cross vendor semantics if not de jure standards.
>

Agreed. I'd go a little further and say I generally have much more warm
and fuzzy feelings when what is a vendor defined command (today) maps to
more or less the same bit of code for a proposed standards ECN.

IP rules prevent us commenting on specific proposals, but there will be
things we review quicker and with a lighter touch vs others where we
ask lots of annoying questions about generality of the feature etc.
Given the effort we are putting in on the kernel side we all want CXL
to succeed and will do our best to encourage activities that make that
more likely. There are other standards bodies available... which may
make more sense for some features.

Command interfaces are not a good place to compete and maintain secrecy.
If vendors want to do that, then they don't get the pony of upstream
support. They get to convince distros to do a custom kernel build for
them: Good luck with that, some of those folk are 'blunt' in their
responses to such requests.

My proposal is we go forward with a bunch of the CXL spec defined commands
to show the 'how' and consider specific proposals for upstream support
of vendor defined commands on a case by case basis (so pretty much
what you say above). Maybe after a few are done we can formalize some
rules of thumb to help vendors make such proposals, though maybe some
will figure out it is a better and longer term solution to do 'standards
first development'.

I think we do need to look at the safety filtering of tunneled
commands but don't see that as a particularly tricky addition -
for the simple non-destructive commands at least.
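
A minimal sketch of what such filtering could look like, assuming a
default-deny allowlist applied to the opcode of each tunneled command.
The names (ex_tunnel_cmd_allowed, ex_allowed) and the opcode values are
hypothetical and chosen for illustration only; they are not taken from
the kernel or from the CXL specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode values, for illustration only. */
enum {
	EX_OP_IDENTIFY    = 0x0001, /* read-only query */
	EX_OP_GET_LOG     = 0x0401, /* read-only query */
	EX_OP_TRANSFER_FW = 0x0201, /* destructive: changes device state */
	EX_OP_VENDOR_DIAG = 0xC001, /* vendor defined, not reviewed */
};

/*
 * Opcodes considered safe to tunnel. A vendor defined opcode would only
 * be added here after case-by-case review.
 */
static const uint16_t ex_allowed[] = {
	EX_OP_IDENTIFY,
	EX_OP_GET_LOG,
};

/* Default deny: anything not explicitly listed is rejected. */
static bool ex_tunnel_cmd_allowed(uint16_t opcode)
{
	for (size_t i = 0; i < sizeof(ex_allowed) / sizeof(ex_allowed[0]); i++) {
		if (ex_allowed[i] == opcode)
			return true;
	}
	return false;
}
```

With this shape, supporting an additional command is a one-line change
to the table, which keeps each addition easy to review on its own.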

Jonathan
