On Fri, 25 Jan 2019 10:57:38 -0500 Eric Farman <farman@xxxxxxxxxxxxx> wrote: > On 01/25/2019 07:58 AM, Cornelia Huck wrote: > > On Fri, 25 Jan 2019 11:24:37 +0100 > > Cornelia Huck <cohuck@xxxxxxxxxx> wrote: > > > >> On Thu, 24 Jan 2019 21:37:44 -0500 > >> Eric Farman <farman@xxxxxxxxxxxxx> wrote: > >> > >>> On 01/24/2019 09:25 PM, Eric Farman wrote: > >>>> > >>>> > >>>> On 01/21/2019 06:03 AM, Cornelia Huck wrote: > >> > >>>> [1] I think these changes are cool. We end up going into (and staying > >>>> in) state=BUSY if we get cc=0 on the SSCH, rather than in/out as we > >>>> bumble along. > >>>> > >>>> But why can't these be separated out from this patch? It does change > >>>> the behavior of the state machine, and seem distinct from the addition > >>>> of the mutex you otherwise add here? At the very least, this behavior > >>>> change should be documented in the commit since it's otherwise lost in > >>>> the mutex/EAGAIN stuff. > >> > >> That's a very good idea. I'll factor them out into a separate patch. > > > > And now that I've factored it out, I noticed some more problems. > > That's good! Maybe it helps us with the circles we're on :) :) > > > > > What we basically need is the following, I think: > > > > - The code should not be interrupted while we process the channel > > program, do the ssch etc. We want the caller to try again later (i.e. > > return -EAGAIN) > > - We currently do not want the user space to submit another channel > > program while the first one is still in flight. > > These two seem to contradict one another. I think you're saying is that > we don't _want_ userspace to issue another channel program, even though > its _allowed_ to as far as vfio-ccw is concerned. What I'm trying to say is that we want to distinguish two things: - The code is currently doing translation etc. We probably want to keep that atomic, in order not to make things too complicated. - We have sent the ssch() to the hardware, but have not yet received the final interrupt for that request (that's what I meant with "in flight"). It's easier for the first shot to disallow a second ssch() as that would need handling of more than one cp request, but we may want to allow it in the future. A hsch()/csch() (which does not generate a new cp) should be fine. (see also my reply to Halil's mail) > > As submitting another > > one is a valid request, however, we should allow this in the future > > (once we have the code to handle that in place). > > - With the async interface, we want user space to be able to submit a > > halt/clear while a start request is still in flight, but not while > > we're processing a start request with translation etc. We probably > > want to do -EAGAIN in that case. > > > > My idea would be: > > > > - The BUSY state denotes "I'm busy processing a request right now, try > > again". We hold it while processing the cp and doing the ssch and > > leave it afterwards (i.e., while the start request is processed by > > the hardware). I/O requests and async requests get -EAGAIN in that > > state. > > - A new state (CP_PENDING?) is entered after ssch returned with cc 0 > > (from the BUSY state). We stay in there as long as no final state for > > that request has been received and delivered. (This may be final > > interrupt for that request, a deferred cc, or successful halt/clear.) > > I/O requests get -EBUSY > > I liked CP_PENDING, since it corresponds to the subchannel being marked > "start pending" as described in POPS, but this statement suggests that > the BUSY/PENDING state to be swapped, such that state=PENDING returns > -EAGAIN and state=BUSY returns -EBUSY. Not super-concerned with the > terminology though. What about s/BUSY/CP_PROCESSING/ ? > > , async requests are processed. This state can > > be removed again once we are able to handle more than one outstanding > > cp. > > > > Does that make sense? > > > > I think so, and I think I like it. So you want to distinguish between > (I have swapped BUSY/PENDING in this example per my above comment): > > A) SSCH issued by userspace (IDLE->PENDING) > B) SSCH issued (successfully) by kernel (PENDING->BUSY) > B') SSCH issued (unsuccessfully) by kernel (PENDING->IDLE?) I think so. > C) Interrupt received by kernel (no change?) > D) Interrupt given to userspace (BUSY->IDLE) Only if that is the final interrupt for that cp. > > If we receive A and A, the second A gets EAGAIN > > If we do A+B and A, the second A gets EBUSY (unless async, which is > processed) Nod. > Does the boundary of "in flight" in the interrupt side (C and D) need to > be defined, such that we go BUSY->PENDING->IDLE instead of BUSY->IDLE ? I don't think we can go BUSY->PENDING (in your terminology), at that would imply a retry of the ssch()?