On Fri, Oct 23, 2020 at 19:27:55 +0200, Igor Mammedov wrote: > On Fri, 23 Oct 2020 11:54:40 -0400 > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote: > > > On Fri, Oct 23, 2020 at 09:47:14AM +0300, Marcel Apfelbaum wrote: > > > Hi David, > > > > > > On Fri, Oct 23, 2020 at 6:49 AM David Gibson <dgibson@xxxxxxxxxx> wrote: > > > > > > On Thu, 22 Oct 2020 11:01:04 -0400 > > > "Michael S. Tsirkin" <mst@xxxxxxxxxx> wrote: > > > > > > > On Thu, Oct 22, 2020 at 05:50:51PM +0300, Marcel Apfelbaum wrote: > > > > [...] > > > > > > > > Right. After detecting just failing unconditionally it a bit too > > > > simplistic IMHO. > > > > > > There's also another factor here, which I thought I'd mentioned > > > already, but looks like I didn't: I think we're still missing some > > > details in what's going on. > > > > > > The premise for this patch is that plugging while the indicator is in > > > transition state is allowed to fail in any way on the guest side. I > > > don't think that's a reasonable interpretation, because it's unworkable > > > for physical hotplug. If the indicator starts blinking while you're in > > > the middle of shoving a card in, you'd be in trouble. > > > > > > So, what I'm assuming here is that while "don't plug while blinking" is > > > the instruction for the operator to obey as best they can, on the guest > > > side the rule has to be "start blinking, wait a while and by the time > > > you leave blinking state again, you can be confident any plugs or > > > unplugs have completed". Obviously still racy in the strict computer > > > science sense, but about the best you can do with slow humans in the > > > mix. > > > > > > So, qemu should of course endeavour to follow that rule as though it > > > was a human operator on a physical machine and not plug when the > > > indicator is blinking. *But* the qemu plug will in practice be fast > > > enough that if we're hitting real problems here, it suggests the guest > > > is still doing something wrong. > > > > > > > > > I personally think there is a little bit of over-engineering here. > > > Let's start with the spec: > > > > > >   Power Indicator Blinking > > >   A blinking Power Indicator indicates that the slot is powering up or > > > powering down and that > > >   insertion or removal of the adapter is not permitted. > > > > > > What exactly is an interpretation here? > > > As you stated, the races are theoretical, the whole point of the indicator > > > is to let the operator know he can't plug the device just yet. > > > > > > I understand it would be more user friendly if the QEMU would wait internally > > > for the > > > blinking to end, but the whole point of the indicator is to let the operator > > > (human or machine) > > > know they can't plug the device at a specific time. > > > Should QEMU take the responsibility of the operator? Is it even correct? > > > > > > Even if we would want such a feature, how is it related to this patch? > > > The patch simply refuses to start a hotplug operation when it knows it will not > > > succeed. > > >  > > > Another way that would make sense to me would be is a new QEMU interface other > > > than > > > "add_device", let's say "adding_device_allowed", that would return true if the > > > hotplug is allowed > > > at this point of time. (I am aware of the theoretical races) > > > > Rather than adding_device_allowed, something like "query slot" > > might be helpful for debugging. That would help user figure out > > e.g. why isn't device visible without any races. > > Would be new command useful tough? What we end up is broken guest > (if I read commit message right) and a user who has no idea if > device_add was successful or not. > So what user should do in this case > - wait till it explodes? > - can user remove it or it would be stuck there forever? > - poll slot before hotplug, manually? > > (if this is the case then failing device_add cleanly doesn't sound bad, > it looks similar to another error we have "/* Check if hot-plug is disabled on the slot */" > in pcie_cap_slot_pre_plug_cb) > > CCing libvirt, as it concerns not only QEMU. The only reason a separate command might make sense is if libvirt would want to provide a specific error to the user/upper management layer that the operation failed due to a transient failure (so that it can be retried later). We don't really want to go to a policy decision of how long to wait in such case, so unless qemu waits libvirt will plainly want to report an error. That said IMO 'device_add' should still fail if it's certain that the device won't be plugged in. This will fix any client which is currently in use. Adding a separate command is worth only for pre-checking for saner error handling. > > > The above will at least mimic the mechanics of the pyhs world. The operator > > > looks at the indicator, > > > the management software checks if adding the device is allowed. > > > Since it is a corner case I would prefer the device_add to fail rather than > > > introducing a new interface, > > > but that's just me. > > > > > > Thanks, > > > Marcel > > > > > > > I think we want QEMU management interface to be reasonably > > abstract and agnostic if possible. Pushing knowledge of hardware > > detail to management will just lead to pain IMHO. > > We supported device_add which practically never fails for years, > > For CPUs and RAM, device_add can fail so maybe management is also > prepared to handle errors on PCI hotplug path. While it was me who implemented device_add for cpu/memory I don't remmeber any more whether we are really prepared for it. We certainly want to do it if there's a possibility to do it.