Re: [RFC PATCH v3 2/4] PCI/doe: Add Data Object Exchange support

Jonathan Cameron <Jonathan.Cameron@xxxxxxxxxx> · Fri, 14 May 2021 13:39:42 +0100

On Fri, 14 May 2021 12:15:38 +0100
Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> wrote:

> On Fri, May 14, 2021 at 09:47:55AM +0100, Jonathan Cameron wrote:
> > On Thu, 13 May 2021 14:20:38 -0700
> > Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> >   
> > > On Tue, May 11, 2021 at 9:52 AM Jonathan Cameron
> > > <Jonathan.Cameron@xxxxxxxxxx> wrote:  
> > > >
> > > > On Thu, 6 May 2021 14:59:34 -0700
> > > > Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
> > > >    
> > > > > On Tue, Apr 20, 2021 at 12:54:49AM +0800, Jonathan Cameron wrote:    
> > > > > > +
> > > > > > +static int pci_doe_send_req(struct pci_doe *doe, struct pci_doe_exchange *ex)
> > > > > > +{
> > > > > > +   struct pci_dev *pdev = doe->pdev;
> > > > > > +   u32 val;
> > > > > > +   int i;
> > > > > > +
> > > > > > +   /*
> > > > > > +    * Check the DOE busy bit is not set. If it is set, this could indicate
> > > > > > +    * someone other than Linux (e.g. firmware) is using the mailbox. Note
> > > > > > +    * it is expected that firmware and OS will negotiate access rights via
> > > > > > +    * an, as yet to be defined method.
> > > > > > +    */
> > > > > > +   pci_read_config_dword(pdev, doe->cap + PCI_DOE_STATUS, &val);
> > > > > > +   if (FIELD_GET(PCI_DOE_STATUS_BUSY, val))
> > > > > > +           return -EBUSY;    
> > > > >
> > > > > In discussion with Dan we believe that user space could also be issuing
> > > > > commands and would potentially cause us to be locked out.
> > > > >
> > > > > We agree that firmware should be out of the way here and if it is blocking
> > > > > the OS there is not much we can do about it.
> > > > >
> > > > > However, if user space is using the mailbox we need to synchronize with them
> > > > > via pci_cfg_access_[try]lock().  This should avoid this EBUSY condition.    
> > > >
> > > > Hi Ira, thanks for taking a look.
> > > >
> > > > So the question here is whether we can ever safely work with a
> > > > userspace that is accessing the DOE.  I think the answer is no we can't.
> > > >
> > > > We'd have no way of knowing that userspace left the DOE in a clean state
> > > > without resetting every time we want to use it (which can take 1 second)
> > > > or doing significant sanity checking (can we tell if something is
> > > > in flight?).  Note that if userspace and kernel were talking different
> > > > protocols nothing sensible could be done to prevent them receiving each
> > > > other's answers (unless you can rely on userspace holding the lock until
> > > > it is done - which you can't as who trusts userspace?)    
> > > 
> > > There is no ability for userpsace to lock out the kernel, only kernel
> > > locking out userspace.  
> > 
> > Hi Dan,
> > 
> > Got it. Writing userspace to code with arbitrary kernel
> > breakage of exchanges userspace initialized is going to be nasty. 
> >   
> > >   
> > > > You could do
> > > > something horrible like back off after peeking at the protocol to see
> > > > if it might be yours, but even that only works assuming the two are
> > > > trying to talk different protocols (talking the same protocol isn't allowed
> > > > but no way to enforce that using just pci_cfg_access_lock()).    
> > > 
> > > Wait why isn't pci_cfg_access_lock() sufficient? The userspace DOE
> > > transfer is halted, the kernel validates the state of DOE, does it's
> > > work and releases the lock.  
> > 
> > It's that 'validate the state of the DOE' which is the problem.  I 'think'
> > the only way to do that is to issue an abort every time and I'm really
> > not liking the fact that adds a potential 1 second sleep to every
> > DOE access from the kernel.  
> 
> IIUC an abort would mean game over for *every* transaction in flight,
> not sure that's the best way of preventing userspace from mingling
> but as you mentioned I don't think there is a way around it with the
> current protocol.
> 
> I don't see how a lock would solve this issue either - how would it ?

It would only work if symmetric (can be locked by userspace or kernel space)
and userspace was well behaved.

> 
> Basically you have to stop userspace from issuing requests for the
> duration of a request/response (per-protocol) session, right ?

Yes, but also stop the kernel from issuing requests for the duration of
a userspace request/response.  Imagine userspace issues a request for
a particular part of CDAT then kernel issues it's own request for a different
part of CDAT. Unless we can ensure the userspace request / response pair
is done, then there is no way for the kernel to verify that the response
it gets is to the request it made.  You could do some protocol specific
validation but that is horrible and you might need to read a bunch of
additional records to do it (checksum).

As currently implemented, I'm not allowing concurrent exchanges from
different protocols because it is very fiddly to do (handling the busy flag)
and I'm not convinced there is a real usecase.  Whilst in theory you can
have lots of protocols on one DOE, the protocols defined so far have had
a bunch or restrictions that mean at most you can currently collocate 2
protocols only.  If it turns out to be necessary at some future time, then
we can look at adding it.

I think the right way to do this is to put a proper interface in place
to expose the functionality to userspace via a path where mediation can
be easily handled.

Jonathan

> 
> Lorenzo