Jonathan Cameron wrote:
> On Wed, 22 Jun 2022 17:25:02 -0700
> Ira Weiny <ira.weiny@xxxxxxxxx> wrote:
>
> > On Wed, Jun 22, 2022 at 03:57:34PM -0700, Dan Williams wrote:
> > > Ira Weiny wrote:
> > > > On Fri, Jun 17, 2022 at 03:56:38PM -0700, Dan Williams wrote:
> > > [..]
> > > > > > +static int pci_doe_discovery(struct pci_doe_mb *doe_mb, u8 *index, u16 *vid,
> > > > > > +                             u8 *protocol)
> > > > > > +{
> > > > > > +        u32 request_pl = FIELD_PREP(PCI_DOE_DATA_OBJECT_DISC_REQ_3_INDEX,
> > > > > > +                                    *index);
> > > > > > +        u32 response_pl;
> > > > > > +        DECLARE_COMPLETION_ONSTACK(c);
> > > > > > +        struct pci_doe_task task = {
> > > > > > +                .prot.vid = PCI_VENDOR_ID_PCI_SIG,
> > > > > > +                .prot.type = PCI_DOE_PROTOCOL_DISCOVERY,
> > > > > > +                .request_pl = &request_pl,
> > > > > > +                .request_pl_sz = sizeof(request_pl),
> > > > > > +                .response_pl = &response_pl,
> > > > > > +                .response_pl_sz = sizeof(response_pl),
> > > > > > +                .complete = pci_doe_task_complete,
> > > > > > +                .private = &c,
> > > > > > +        };
> > > > > > +        int ret;
> > > > > > +
> > > > > > +        ret = pci_doe_submit_task(doe_mb, &task);
> > > > > > +        if (ret < 0)
> > > > > > +                return ret;
> > > > > > +
> > > > > > +        wait_for_completion(&c);
> > > > >
> > > > > Another place where the need for a completion can be replaced with
> > > > > flush_work().
> > > >
> > > > No, not here.  While this call is internal it is actually acting like an
> > > > external caller.  This specific wait is for that response to get back.
> > > >
> > > > This pattern was specifically asked for by you.  Previously Jonathan had a
> > > > synchronous call which took care of this, but you said let all callers just
> > > > handle it themselves.  So all callers submit a task and, if they want to wait
> > > > for the response, they have to do so themselves.
> > >
> > > Ah, true, I remember that.  The nice thing about doing your own
> > > wait_for_completion() like this is that you can make it
> > > wait_for_completion_interruptible() to give up on the DOE if it gets
> > > stalled.  However, if you have a work item per task and you're willing to
> > > do an uninterruptible sleep, then flush_work(&task->work) is identical.
> >
> > So when you mentioned a work item per task I really jumped on that idea.  But I
> > realize now that it is a bit more complicated than that.
> >
> > Currently a work item is actually one step of the state machine.  The state
> > machine queues the next step of work as a new work item.
> >
> > I'm going to have to change the state machine quite a bit.  I still agree with
> > the one work item per task, but it is going to take a bit of work to get the
> > state machine to operate within that single task.
> >
> > I don't like what might result if I layer a work queue on top of using the
> > system work queue for the individual steps of the state machine.  So stay
> > tuned.
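
For concreteness, "one work item per task" could look something like the sketch
below.  This is illustrative only: the embedded 'work' member, the
pci_doe_task_work() handler, the 'doe_mb' back-pointer and doe_mb->work_queue
are names I am assuming for the sketch, not code from this series.

struct pci_doe_task {
        struct {
                u16 vid;
                u8 type;
        } prot;
        u32 *request_pl;
        size_t request_pl_sz;
        u32 *response_pl;
        size_t response_pl_sz;
        void (*complete)(struct pci_doe_task *task);
        void *private;
        struct work_struct work;        /* assumed: one work item per task */
        struct pci_doe_mb *doe_mb;      /* assumed: back-pointer for the handler */
};

static void pci_doe_task_work(struct work_struct *work)
{
        struct pci_doe_task *task = container_of(work, struct pci_doe_task, work);

        /*
         * Run the whole request/response exchange for this task here,
         * start to finish, instead of queueing per-step work items.
         */

        if (task->complete)
                task->complete(task);   /* optional async notification */
}

static int pci_doe_submit_task(struct pci_doe_mb *doe_mb, struct pci_doe_task *task)
{
        task->doe_mb = doe_mb;
        INIT_WORK(&task->work, pci_doe_task_work);
        if (!queue_work(doe_mb->work_queue, &task->work))       /* assumed ordered wq */
                return -EBUSY;
        return 0;
}

With that shape, a synchronous caller such as pci_doe_discovery() could drop
the on-stack completion entirely and just call flush_work(&task.work) after
pci_doe_submit_task(), as suggested above.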
>
> Yup. I went through that (between RFC v1 and RFC v2) and it wasn't pretty
> - maybe it's worth a revisit though.
>
> To throw another viewpoint into the mix: note that I want a solution, and
> in my view DOE is slow and never on a fast path + I don't see it being
> high-churn code, so it needs to be fairly maintainable but not necessarily
> super simple or architecturally clean (at the level of state machines /
> work queues etc. - the interfaces do need to be clean!)
>
> Let's go back to RFC v1, which IIRC was basically queue on a mutex, and
> consider it in the light of where we've ended up.  I wussed out on arguing
> much about this at the time because consensus + moving forward was more
> important to me than the chosen architecture.
>
> Taking a slightly black and white view of the requirements, I don't think
> we lose anything by using this list...
>
> 1. Synchronous (if anyone needs async at the level of the caller, they can
>    spin a thread up).  Async is the corner case, not the common one.
> 2. Small number (< 3 I'm guessing) of protocols per instance.
> 3. Very rare that there is significant contention.  Fairness doesn't matter.
>    Normally the only reason we'd get contention is userspace triggering
>    access to multiple protocols at a time - probably via sysfs or some other
>    slow method.
> 4. Per-protocol ordering can be maintained by the protocol, not the DOE layer.
> 5. DOE is basically a bus over which we are talking to different devices
>    - think of it like I2C, but rather than addresses we have protocol IDs.
>
> That last analogy brings us back to how I think almost all slow buses are
> handled.
>
> At the level of a bus, a lock is used for mutual exclusion (often also
> protecting bus controller register state etc).  No workqueues or similar
> complexity - the underlying hardware is typically doing DMA of the result
> into a provided buffer, with only one transaction in flight at that layer
> at any one time.  Note there is more complex handling for high-perf cases,
> but in many cases it's not really used.
>
> We have a bus lock that can ensure exclusion over sequences if necessary
> (there's one in SPI).
>
> If a given driver needs to ensure exclusion for RMW or similar sequences
> of operations, it takes a driver-specific mutex and holds it across these
> sequences of slow operations, which usually sleep, involve interrupts and
> all sorts of fun.  Normally there is a completion in there somewhere to get
> from the 'done' interrupt on the bus controller back to the
> i2c_smbus_bus_read() etc. that is waiting on the result.
>
> This model works, is super simple and layered.
>
> In the case of no contention (perhaps 99% of the time), it immediately runs
> the DOE/bus access in the thread that made the read/write request.
> So there is no overhead of going to a workqueue.
>
> Implementation is one linear function, no state machine needed.

Agree on this principle: once you have one linear function, it does not matter
whether that function runs under a mutex or in an ordered workqueue.  If the
one linear function with a mutex ends up needing to invent its own waitqueues
and event completion notifications, then maybe it's better to just use a
workqueue than reinvent those wheels, but if push comes to shove that's just a
nice-to-have if we get to the one linear function implementation.

> Anyhow, to refer back to my initial comment: I'm not that fussed about how
> we do this, but it's a blocker on other work, so a quick solution is more
> important to me than a perfect one.

Agree.
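
To make that concrete, the "one linear function" shape would be roughly the
sketch below.  pci_doe_exchange(), the helper functions and the exec_lock
member are made-up names for illustration, not the API in this series.

/* Sketch only: placeholder names, serialization by a per-mailbox mutex */
static int pci_doe_exchange(struct pci_doe_mb *doe_mb,
                            const void *req, size_t req_sz,
                            void *resp, size_t resp_sz)
{
        int rc;

        /* One transaction in flight per DOE mailbox, like an I2C adapter lock */
        mutex_lock(&doe_mb->exec_lock);

        /* Write the request object into the mailbox and set GO */
        rc = pci_doe_send_req(doe_mb, req, req_sz);
        if (rc)
                goto out;

        /* Sleep until Data Object Ready (IRQ or polling), or time out */
        rc = pci_doe_wait_ready(doe_mb);
        if (rc)
                goto out;

        /* Read the response object back out of the mailbox */
        rc = pci_doe_recv_resp(doe_mb, resp, resp_sz);
out:
        mutex_unlock(&doe_mb->exec_lock);
        return rc;
}

Whether that body ends up serialized by a mutex as above, or becomes the work
function of a single per-task item on an ordered workqueue, is then an internal
detail - which is the point above.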