On Thu, 7 Jun 2018 18:17:57 +0200
Halil Pasic <pasic@xxxxxxxxxxxxx> wrote:

> On 06/07/2018 11:54 AM, Cornelia Huck wrote:
> > Hm, I think we need to be more precise as to what scsw we're talking
> > about. Bad ascii art time:
> >
> > --------------
> > | scsw(g)    |          ssch
> > --------------           |
> >                          |                          guest
> > --------------------------------------------------------------
> >                          |                          qemu
> > --------------           v
> > | scsw(q)    |        emulate
> > --------------           |
> >                          |
> > --------------           v
> > | scsw(r)    |        pwrite()
> > --------------           |
> >                          |
> > --------------------------------------------------------------
> >                          |                          vfio
> >                          v
> >                         ssch
> >                          |
> > --------------------------------------------------------------
> >                          |                          hardware
> > --------------           v
> > | scsw(h)    |  actually do something
> > --------------
> >
> > The guest issues a ssch (which gets intercepted; it won't get control
> > back until ssch finishes with a cc set.) scsw(g) won't change, unless
> > the guest does a stsch for the subchannel on another vcpu, in which
> > case it will get whatever information qemu holds in scsw(q) at that
> > point in time.
>
> (1) I think the BQL makes "other vcpu or not" kind of the same. We will
> effectively start processing the stsch in QEMU after we are done
> with the ssch in QEMU.

Yeah, but my main point was that the change is in scsw(q) only.

>
> >
> > When qemu starts to emulate the guest's ssch, it will set the start
> > function bit in the fctl field of scsw(q). It then copies scsw(q) to
> > scsw(r) in the vfio region.
> >
>
> (2) This is architecturally wrong AFAIK. The fctl bit is supposed to be set on
> cc 0. But because of (1) this might not be observable by the guest --
> we can fix it up.

The bit is set some time during the processing of the instruction - we
need finite time to do the processing, but it should not be observable
by the guest. We should not set the bit if we won't set cc 0.
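To make that ordering concrete, here is a minimal sketch of "decide the
cc first, then touch fctl". The struct, the bit values and the function
name are made up for illustration (they are not the QEMU or Linux
definitions), and the cc rules are reduced to the two checks that matter
for this discussion:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values and names; not the QEMU or Linux definitions. */
enum { FCTL_START = 0x4, FCTL_HALT = 0x2, FCTL_CLEAR = 0x1 };

struct scsw_sketch {
    uint8_t fctl;            /* function control bits */
    uint8_t status_pending;  /* nonzero if status is pending */
};

/*
 * Emulate ssch against scsw(q). The cc is decided first; the start
 * function bit is only set on the cc 0 path, so a stsch processed
 * afterwards never sees the bit without a matching cc 0.
 */
static int emulate_ssch(struct scsw_sketch *q)
{
    assert(q != NULL);
    if (q->status_pending)
        return 1;                   /* cc 1: status pending */
    if (q->fctl & (FCTL_START | FCTL_HALT | FCTL_CLEAR))
        return 2;                   /* cc 2: a function is in progress */
    q->fctl |= FCTL_START;          /* cc 0: indicate the start function */
    return 0;
}
```

With this shape, every early return leaves scsw(q) untouched, so there
is nothing to fix up afterwards on the non-zero cc paths.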
>
> (3) IMHO scsw(r) is not a real scsw as defined by the architecture but
> a strange communication structure (not) defined by vfio-ccw.

IIRC it was intended as a real scsw; we just did not want to define the
whole structure as both Linux and QEMU have scsw definitions that map
to the same hardware structure but look different.

>
> > The vfio code will then proceed to call ssch on the real subchannel.
> > This is the first time we get really asynchronous, as the ssch will
> > return with cc set and the start function will be performed at some
> > point in time. If we did a stsch on the real subchannel, we would
> > see that scsw(h) now has the start function bit set.
> >
>
> (4) I guess only if cc 0.

Yes, obviously.

>
> > Currently, we won't return back up the chain until we get an interrupt
> > from the hardware, at which time we update the scsw(r) from the irb.
> > This will propagate into the scsw(q). At the time we finish handling
> > the guest's ssch and return control to it, we're all done and if the
> > guest does a stsch to update its scsw(g), it will get the current
> > scsw(q) which will already contain the scsw from the interrupt's irb
> > (indicating that the start function is already finished).
> >
> > Now let's imagine we have a future implementation that handles actually
> > performing the start on the hardware asynchronously, i.e. it returns
> > control to the guest without the interrupt having been posted (let's
> > say that it is a longer-running I/O request). If the guest now did a
> > stsch to update scsw(g), it would get the current state of scsw(q),
> > which would be "start function set, but not done yet".
>
> (5) AFAIK this is how the current implementation works. We don't wait
> for the I/O interrupt on the host to present a cc to the guest for its
> ssch.

But the vfio code does wait, no? We just signal the interrupt via
eventfd as well.

>
> >
> > If the guest now does a hsch, it would trap in the same way as the ssch
> > before.
> > When qemu gets control, it adds the halt bit in scsw(q) (which
> > is in accordance with the architecture).
>
> (7) Again, it's about when fctl is set according to the architecture...

Same comment as above. If we do a hsch for a subchannel with the start
function set, we'll set cc 0.

>
> > My proposal is to do the same
> > copying to scsw(r) again, which would mean we get a request with both
> > the halt and the start bit set.
>
> (8) IMHO when receiving the 'request' we are and should be in instruction
> context -- as opposed to basic io function context. So we should not set fctl
> before we know what our guest cc will be. But since scsw(r) is not a real
> scsw it is just strange.

I think what we are doing is really 'performing the start function' -
it's just not asynchronous in the current implementation. So we already
know that ssch will return with cc 0.

>
> > The vfio code now needs to do a hsch
> > (instead of a ssch). The real channel subsystem should figure this out,
> > as we can't reliably check whether the start function has concluded
> > already (there's always a race window).
> >
>
> (9) Yes we can't tell for sure if the start function is still being performed
> by the stuff below.

We'll need to figure out a way to outsource most of those decisions to
the real hardware. If we're not sure whether we can set cc 0, we should
probably just set cc 2 and be done with it. (Serialization with regard
to interrupts is needed, of course.)

>
> Regards,
> Halil

Thanks for reading!

>
> > For csch, things are a bit different (which the code posted here did
> > not take into account). The qemu emulation of csch needs to clear any
> > start/halt bits in scsw(q) when setting the clear bit there, and
> > therefore scsw(r) will only have the clear bit set in that case. We
> > still should do an unconditional csch for the same reasons as above;
> > the hardware will do the same things (clearing start/halt, setting
> > clear) in the scsw(h).
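As a rough sketch of the hsch/csch handling described in the quoted
text: qemu updates fctl in scsw(q), copies it to scsw(r), and the vfio
side picks the instruction from the bits it finds there. All names and
bit values below are hypothetical (not the vfio-ccw or QEMU
definitions), and the cc handling for hsch/csch is left out entirely:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values and names, for illustration only. */
enum { FCTL_START = 0x4, FCTL_HALT = 0x2, FCTL_CLEAR = 0x1 };
enum req_insn { REQ_NONE, REQ_SSCH, REQ_HSCH, REQ_CSCH };

struct scsw_sketch {
    uint8_t fctl;   /* function control bits */
};

/* qemu side: hsch adds the halt bit and keeps a pending start bit. */
static void qemu_hsch(struct scsw_sketch *q)
{
    assert(q != NULL);
    q->fctl |= FCTL_HALT;
}

/* qemu side: csch drops start/halt and leaves only the clear bit. */
static void qemu_csch(struct scsw_sketch *q)
{
    assert(q != NULL);
    q->fctl = (uint8_t)((q->fctl & ~(FCTL_START | FCTL_HALT)) | FCTL_CLEAR);
}

/*
 * vfio side: choose the instruction from fctl in scsw(r). Halt wins
 * over start when both are set; no attempt is made to guess whether
 * the start function has already concluded.
 */
static enum req_insn pick_insn(uint8_t fctl)
{
    if (fctl & FCTL_CLEAR)
        return REQ_CSCH;
    if (fctl & FCTL_HALT)
        return REQ_HSCH;
    if (fctl & FCTL_START)
        return REQ_SSCH;
    return REQ_NONE;
}
```

The selection deliberately does not inspect any state beyond the
request bits; whether the start function is still active is left to
the real channel subsystem, because of the race window mentioned above.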
> >
> > Congratulations, you've reached the end:) I hope that was helpful and
> > not too confusing.
> >