On 2011-08-24 00:05, Alex Williamson wrote: > On Tue, 2011-08-23 at 15:31 +0200, Jan Kiszka wrote: >> Hi Alex, >> >> just ran into some corner case with my reanimated IRQ sharing patches >> that may affect vfio as well: >> >> How are vfio_enable/disable_intx synchronized against all other possible >> spots that call pci_block_user_cfg_access? >> >> I hit the recursion bug check in pci_block_user_cfg_access with my code >> which takes the user_cfg lock like vfio does. It likely races with >> pci_reset_function here - and should do so in vfio as well. > > So the race is that we're doing a pci_reset_function and while we've got > pci_block_user_cfg_access set, an interrupt comes in (maybe from a > device sharing the interrupt line), and we hit the BUG_ON when trying to > nest pci_block_user_cfg_access? Most probably the scenario I was seeing, but I didn't debugged it in all details as it already locked up my notebook twice. > >> Just taking some lock would mean having to run pci_reset_function with >> IRQs disabled to synchronize with the IRQ handler (not sure if that is >> possible at all). Alternatively, we would have to disable the interrupt >> line or deregister the IRQ while resetting. Or we perform INTx mask >> manipulation in an unsynchronized fashion, resolving races with user >> space differently (still need to think about this option). >> >> Any other thoughts? > > I think this is a bit easier for vfio since the reset is already routed > through a vfio ioctl. We can just use a mutex between the two, reset > would wait on the mutex while the interrupt handler would skip masking > of a shared interrupt if it can't get the mutex (hopefully the interrupt > is really for a shared device or we squelch it via the reset before we > trigger the spurious interrupt counter). > > I think the only path for kvm assignment that doesn't involve also > rerouting the reset through a kvm ioctl would have to be avoiding the > problem in userspace. We'd have to unregister the interrupt handler, > reset, then re-register. That sounds pretty heavy, but the reset is > already a slow process. Thanks, I don't think we can allow userspace to potentially crash the host. Anyway, this problem turns out to be way more generic. Just run two "echo 1 > /sys/bus/pci/.../reset" loops on the same device in parallel. But be warned, you will have to reboot that box afterward. Maybe this very creative interface of pci_block_user_cfg_access was once OK when only the IPR SCSI driver used it. But by the time it grew beyond that use case, it became hopelessly broken (well, open-coded locking...). We need to redesign it, synchronizing users that can sleep via a simple mutex and addressing access to the status/command word separately via an IRQ-save spinlock (as we need that service in hard IRQ handlers). Jan -- Siemens AG, Corporate Technology, CT T DE IT 1 Corporate Competence Center Embedded Linux -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html