On Wed, 2011-08-24 at 11:09 +0200, Jan Kiszka wrote: > On 2011-08-24 00:05, Alex Williamson wrote: > > On Tue, 2011-08-23 at 15:31 +0200, Jan Kiszka wrote: > >> Hi Alex, > >> > >> just ran into some corner case with my reanimated IRQ sharing patches > >> that may affect vfio as well: > >> > >> How are vfio_enable/disable_intx synchronized against all other possible > >> spots that call pci_block_user_cfg_access? > >> > >> I hit the recursion bug check in pci_block_user_cfg_access with my code > >> which takes the user_cfg lock like vfio does. It likely races with > >> pci_reset_function here - and should do so in vfio as well. > > > > So the race is that we're doing a pci_reset_function and while we've got > > pci_block_user_cfg_access set, an interrupt comes in (maybe from a > > device sharing the interrupt line), and we hit the BUG_ON when trying to > > nest pci_block_user_cfg_access? > > Most probably the scenario I was seeing, but I didn't debugged it in all > details as it already locked up my notebook twice. > > > > >> Just taking some lock would mean having to run pci_reset_function with > >> IRQs disabled to synchronize with the IRQ handler (not sure if that is > >> possible at all). Alternatively, we would have to disable the interrupt > >> line or deregister the IRQ while resetting. Or we perform INTx mask > >> manipulation in an unsynchronized fashion, resolving races with user > >> space differently (still need to think about this option). > >> > >> Any other thoughts? > > > > I think this is a bit easier for vfio since the reset is already routed > > through a vfio ioctl. We can just use a mutex between the two, reset > > would wait on the mutex while the interrupt handler would skip masking > > of a shared interrupt if it can't get the mutex (hopefully the interrupt > > is really for a shared device or we squelch it via the reset before we > > trigger the spurious interrupt counter). > > > > I think the only path for kvm assignment that doesn't involve also > > rerouting the reset through a kvm ioctl would have to be avoiding the > > problem in userspace. We'd have to unregister the interrupt handler, > > reset, then re-register. That sounds pretty heavy, but the reset is > > already a slow process. Thanks, > > I don't think we can allow userspace to potentially crash the host. > > Anyway, this problem turns out to be way more generic. Just run two > "echo 1 > /sys/bus/pci/.../reset" loops on the same device in parallel. > But be warned, you will have to reboot that box afterward. > > Maybe this very creative interface of pci_block_user_cfg_access was once > OK when only the IPR SCSI driver used it. But by the time it grew beyond > that use case, it became hopelessly broken (well, open-coded > locking...). We need to redesign it, synchronizing users that can sleep > via a simple mutex and addressing access to the status/command word > separately via an IRQ-save spinlock (as we need that service in hard IRQ > handlers). Yep, that sounds like the best path. pci_block_user_cfg_access is at best "fragile" in it's current implementation. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html