On Fri, 2014-05-23 at 15:00 +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2014-05-23 at 14:37 +1000, Gavin Shan wrote:
> > >There's no notification, the user needs to observe the return value and
> > >poll? Should we be enabling an eventfd to notify the user of the state
> > >change?
> > >
> >
> > Yes. The user needs to monitor the return value. We should have one
> > notification, but it's for later as we discussed :-)
>
> ../..
>
> > >How does the guest learn about the error? Does it need to?
> >
> > When guest detects 0xFF's from reading PCI config space or IO, it's going
> > to check the device (PE) state. If the device (PE) has been put into frozen
> > state, the recovery will be started.
>
> Quick recap for Alex W (we discussed that with Alex G).
>
> While a notification looks like a worthwhile addition in the long run, it
> is not sufficient and not used today and I prefer that we keep that as
> something to add later for those two main reasons:
>
>  - First, the kernel itself isn't always notified. For example, if we
> implement on top of an RTAS backend (PR KVM under pHyp) or if we are on top
> of PowerNV but the error is a PHB "fence" (the entire PCI Host bridge gets
> fenced out in hardware due to an internal error), then we get no
> notification. Only polling of the hardware or firmware will tell us. Since
> we don't want to have a polling timer in the kernel, that means that the
> userspace client of VFIO (or alternatively the KVM guest) is the one that
> polls.
>
>  - Second, this is how our primary user expects it: The primary (and only
> initial) user of this will be qemu/KVM for PAPR guests and they don't have a
> notification mechanism. Instead they query the EEH state after detecting an
> all 1's return from MMIO or config space. This is how PAPR specifies it so
> we are just implementing the spec here :-)
>
> Because of these, I think we shouldn't worry too much about notification at
> this stage.

Ok, I was asking more about an error log that indicates what error
occurred to freeze the hardware so that the user can make a more educated
guess whether recovery is an option.  Given that you have cases where
there may be no notification and your guest/user already handles this, the
plan to start with polling makes sense.  Thanks,

Alex
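
For reference, the detect-then-poll flow described in the quoted text (read
returns all 1's, client queries the PE state, recovery starts only if the PE
is frozen) could be modelled roughly as below. This is only a toy simulation
under stated assumptions: FakePE, PEState, checked_read32() and the two-state
encoding are hypothetical names made up for illustration, not the VFIO, RTAS
or PAPR interfaces, and recover() is a placeholder for whatever recovery
sequence the real client performs.

```python
from enum import Enum

ALL_ONES_32 = 0xFFFFFFFF


class PEState(Enum):
    """Hypothetical PE states for this sketch; not the real VFIO/PAPR encoding."""
    NORMAL = 0
    FROZEN = 1


class FakePE:
    """Stand-in for a device (PE); purely illustrative."""

    def __init__(self, regs):
        self.regs = dict(regs)
        self.state = PEState.NORMAL

    def freeze(self):
        # Models the hardware isolating the PE after an error.
        self.state = PEState.FROZEN

    def read32(self, offset):
        # A frozen PE returns all 1's for every read.
        if self.state is PEState.FROZEN:
            return ALL_ONES_32
        return self.regs[offset]

    def get_state(self):
        # Models the explicit state query issued by the client (the poll).
        return self.state

    def recover(self):
        # Placeholder for the real recovery sequence.
        self.state = PEState.NORMAL


def checked_read32(pe, offset):
    """Read a register; query the PE state only when the value is all 1's.

    Returns (value, recovered). All 1's can be legitimate register contents,
    so the state query decides whether recovery is needed.
    """
    value = pe.read32(offset)
    if value != ALL_ONES_32:
        return value, False
    if pe.get_state() is not PEState.FROZEN:
        return value, False  # genuine all-1's data, nothing to recover
    pe.recover()
    return pe.read32(offset), True
```

The point of the sketch is that no asynchronous notification is involved:
the client only learns about the frozen state because it checks after seeing
all 1's, which matches the no-kernel-polling-timer argument above.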