Ira Weiny wrote:
> On Wed, Jun 22, 2022 at 03:57:34PM -0700, Dan Williams wrote:
> > Ira Weiny wrote:
> > > On Fri, Jun 17, 2022 at 03:56:38PM -0700, Dan Williams wrote:
> > [..]
> > > > > +static int pci_doe_discovery(struct pci_doe_mb *doe_mb, u8 *index, u16 *vid,
> > > > > +			     u8 *protocol)
> > > > > +{
> > > > > +	u32 request_pl = FIELD_PREP(PCI_DOE_DATA_OBJECT_DISC_REQ_3_INDEX,
> > > > > +				    *index);
> > > > > +	u32 response_pl;
> > > > > +	DECLARE_COMPLETION_ONSTACK(c);
> > > > > +	struct pci_doe_task task = {
> > > > > +		.prot.vid = PCI_VENDOR_ID_PCI_SIG,
> > > > > +		.prot.type = PCI_DOE_PROTOCOL_DISCOVERY,
> > > > > +		.request_pl = &request_pl,
> > > > > +		.request_pl_sz = sizeof(request_pl),
> > > > > +		.response_pl = &response_pl,
> > > > > +		.response_pl_sz = sizeof(response_pl),
> > > > > +		.complete = pci_doe_task_complete,
> > > > > +		.private = &c,
> > > > > +	};
> > > > > +	int ret;
> > > > > +
> > > > > +	ret = pci_doe_submit_task(doe_mb, &task);
> > > > > +	if (ret < 0)
> > > > > +		return ret;
> > > > > +
> > > > > +	wait_for_completion(&c);
> > > >
> > > > Another place where the need for a completion can be replaced with
> > > > flush_work().
> > >
> > > No, not here. While this call is internal it is actually acting like an
> > > external caller. This specific wait is for that response to get back.
> > >
> > > This pattern was specifically asked for by you. Previously Jonathan had a
> > > synchronous call which took care of this, but you said let all callers just
> > > handle it themselves. So all callers submit a task, and if they want to wait
> > > for the response they have to do so themselves.
> >
> > Ah, true, I remember that. The nice thing about doing your own
> > wait_for_completion() like this is that you can make it
> > wait_for_completion_interruptible() to give up on the DOE if it gets
> > stalled. However, if you have a work item per-task and you're willing to
> > do an uninterruptible sleep, then flush_work(&task->work) is identical.
>
> So when you mentioned a work item per task I really jumped on that idea. But I
> realize now that it is a bit more complicated than that.
>
> Currently a work item is actually one step of the state machine. The state
> machine queues the next step of work as a new work item.
>
> I'm going to have to change the state machine quite a bit. I still agree with
> the one work item per task, but it is going to take a bit of work to get the
> state machine to operate within that single task.
>
> I don't like what might result if I layer a work queue on top of using the
> system work queue for the individual steps of the state machine. So stay
> tuned.

In the end only one workqueue should exist: either a task queue (my first
preference) or a device-state queue (if the task queue turns out not to
fit). Neither of those use cases should be glomming onto the unbounded
system_wq. Keep it simple with a dedicated ordered queue.
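
For concreteness, here is a rough sketch of what "one work item per task"
might look like, with the whole request/response sequence driven from a
single work function instead of each state machine step re-queueing itself
as a new work item. The ->work, ->rv, and ->doe_mb members and the
pci_doe_send_req()/pci_doe_recv_resp() helpers are placeholders for
illustration, not existing code:

static void pci_doe_task_work(struct work_struct *work)
{
	struct pci_doe_task *task = container_of(work, struct pci_doe_task,
						 work);
	struct pci_doe_mb *doe_mb = task->doe_mb;
	int rc;

	/* step 1: write the request object to the mailbox */
	rc = pci_doe_send_req(doe_mb, task);
	if (rc)
		goto done;

	/* step 2: poll for, then read back, the response object */
	rc = pci_doe_recv_resp(doe_mb, task);
done:
	task->rv = rc;
	task->complete(task);
}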
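
With that in place, pci_doe_discovery() could drop the on-stack completion
entirely and just flush the task's work item, at the cost of the
uninterruptible sleep mentioned above (again assuming a ->work member in
struct pci_doe_task):

	ret = pci_doe_submit_task(doe_mb, &task);
	if (ret < 0)
		return ret;

	/* uninterruptible wait until the task's work item has run */
	flush_work(&task.work);

A caller that wants the option of abandoning a stalled mailbox would keep
its own completion and use wait_for_completion_interruptible(), paired
with some way to cancel the submitted task.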
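
And for the queue itself, something along these lines at mailbox setup,
rather than queueing to system_wq (the name format string and the
->work_queue member are illustrative only):

	doe_mb->work_queue = alloc_ordered_workqueue("%s doe [%x]", 0,
						     dev_name(&pdev->dev),
						     doe_mb->cap_offset);
	if (!doe_mb->work_queue)
		return -ENOMEM;

	...

	/* tasks then execute one at a time, in submission order */
	queue_work(doe_mb->work_queue, &task->work);

An ordered queue gives the single-threaded, FIFO execution the DOE
mailbox needs without inheriting the concurrency (and interference)
characteristics of the shared system workqueue.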