On 10/5/22 10:21 AM, Matthew Rosato wrote:
> On 10/5/22 10:01 AM, Jason Gunthorpe wrote:
>> On Wed, Oct 05, 2022 at 10:57:28AM -0300, Jason Gunthorpe wrote:
>>> On Wed, Oct 05, 2022 at 09:46:45AM -0400, Matthew Rosato wrote:
>>>
>>>> (again, with the follow-up applied) Besides the panic above I just
>>>> noticed there is also this warning that immediately precedes and is
>>>> perhaps more useful. Re: what triggers the WARN, both group->owner
>>>> and group->owner_cnt are already 0
>>>
>>> And this is after the 2nd try that fixes the locking?
>>>
>>> This shows that vfio_group_detach_container() is called twice (which
>>> was my guess), however this looks to be impossible as both calls are
>>> protected by 'if (group->container)' and the function NULLs
>>> group->container and it is all under the proper lock.
>>>
>>> My guess was that missing locking caused the two cases to race and
>>> trigger WARN, but the locking should fix that.
>>>
>>> So I'm at a loss, can you investigate a bit?
>>
>> Huh, perhaps I'm losing my mind, but I'm sure I sent this out, but it
>> is not in the archive. This v2 fixes the missing locking and the rest
>> of the remarks.
>
> Ah, here we go. OK, initial testing with vfio-pci on this version and I note that
>
> 1) the warning/crash is gone
> 2) the iommu group ID no longer increments
>
> I next will take it through the longer series of tests that would crash before 'vfio: Follow a strict lifetime for struct iommu_group' but this looks good so far.
>

OK, this also looks good - thanks! Besides the vfio-pci testing on s390 I also ran some brief tests against both vfio-ccw and vfio-ap.

Tested-by: Matthew Rosato <mjrosato@xxxxxxxxxxxxx>