On Thu, 6 Dec 2018 12:50:50 -0500 Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > On 12/06/2018 11:21 AM, Cornelia Huck wrote: > > On Thu, 6 Dec 2018 10:26:12 -0500 > > Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > > > >> On 12/06/2018 09:39 AM, Cornelia Huck wrote: > >>> On Wed, 5 Dec 2018 13:34:11 -0500 > >>> Farhan Ali <alifm@xxxxxxxxxxxxx> wrote: > >>> > >>>> On 12/05/2018 07:54 AM, Cornelia Huck wrote: > >>>>>> Yeah, that is perfectly clear, but it ain't the complete story. E.g. > >>>>>> are subsequent commands blocking until the preceding command finishes > >>>>>> is part of the interface. And what is good implementation depends on the > >>>>>> answer. What I mean, I first need to understand how things are supposed > >>>>>> to work (together) so I can double check that against the > >>>>>> implementation. Otherwise all I can do is nitpicking. > >>>>>> > >>>>>> To get more tangible: we are in the middle of processing an SSCH request > >>>>>> (e.g. doing the translation) when a HSCH comes in. What should happen? > >>>>>> Should we start processing HSCH after he instruction part of SSCH is > >>>>>> done -- which currently includes translation? Or should we -EBUSY? Or do > >>>>>> we abort START related activities and do the HALT stuff? > >>>>> I think most of the sorting-out-the-operations stuff should be done by > >>>>> the hardware itself, and we should not really try to enforce anything > >>>>> special in our vfio code. > >>>>> > >>>>> For your example, it might be best if a hsch is always accepted and > >>>>> send on towards the hardware. Probably best to reflect back -EAGAIN if > >>>>> we're currently processing another instruction from another vcpu, so > >>>>> that the user space caller can retry. Same for ssch, if another ssch is > >>>>> already being processed. We*could* reflect cc 2 if the fctl bit is > >>>>> already set, but that won't do for csch, so it is probably best to have > >>>>> the hardware figure that out in any case. > >>>>> > >>>>> If I read the code correctly, we currently reflect -EBUSY and not > >>>>> -EAGAIN if we get a ssch request while already processing another one. > >>>>> QEMU hands that back to the guest as a cc 2, which is not 100% correct. > >>>>> In practice, we don't see this with Linux guests due to locking. > >>>>> > >>>> > >>>> If we have a ssch and a csch immediately afterwards from userspace, will > >>>> we end up issuing csch first and then ssch to the hardware? > >>>> > >>>> If I understand correctly, the ccw translation as part of the ssch can > >>>> be a slow operation so it might be possible we issue the csch first? > >>>> In that case we won't actually clear the original start function as > >>>> intended. > >>> > >>> When we start processing the ssch request (translation and so on), we > >>> set the state to BUSY. This means that any csch request will get a > >>> -EBUSY, no overtaking possible. (I think maybe I'll need to check what > >>> this series looks like if I rebase it on top of Pierre's rework, as he > >>> did some changes in the state machine.) > >> > >> I think you meant the state is set to BOXED? otherwise the patch 3 says > >> if state is BUSY and CLEAR event request comes in, we issue the clear > >> instruction, no? > > > > That's what I meant with "need to rebase" :) The BOXED state is gone; I > > just had not rebased on top of it. There's more changes in the state > > machine; if we are on the same page as to what should happen, I can > > start massaging the patches. > > > > > > Sorry maybe I missed it, but are you referring to Pierre's latest > cleanup patches? I don't see him removing the BOXED state. The "remove BOXED state" patch is currently on my vfio-ccw-staging branch. (That reminds me, will need to move it to my vfio-ccw branch and possibly send a pull request. I had hoped to collect more patches for the next release...) > > I think returning -EAGAIN and asking the userspace to retry the > operation sounds reasonable to me. > > But how do we handle the issue of protecting the cmd_region from > simultaneous hsch and csch calls? Do we agree on Pierre's method of > making write calls mutually exclusive? That's in his patch series, right? I did not yet have time to look at it...