On Wed, 30 Jun 2021 10:31:22 -0400 Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote: > On 6/28/21 4:29 PM, Halil Pasic wrote: > > On Fri, 25 Jun 2021 18:07:58 -0400 > > Tony Krowiak <akrowiak@xxxxxxxxxxxxx> wrote: > > > > What is a suitable base for this patch. I've tried the usual suspects, > > but none of them worked. > > I discovered what the problem is here. The patch is based on our > master branch along with the two pre-requisite patches that were > recently reviewed and are currently being merged. The two patches > of which I speak are: > * [PATCH v6 1/2] s390/vfio-ap: clean up mdev resources when remove > callback invoked > Message ID: <20210621155714.1198545-2-akrowiak@xxxxxxxxxxxxx> > > * [PATCH v6 2/2] s390/vfio-ap: r/w lock for PQAP interception handler > function pointer > <20210621155714.1198545-3-akrowiak@xxxxxxxxxxxxx> > > I probably should have included those along with this one. Either that, or state in the cover letter that those are prerequisites. > > > > >> The fix to resolve a lockdep splat while handling the > >> VFIO_GROUP_NOTIFY_SET_KVM event introduced a kvm_busy flag indicating that > >> the vfio_ap device driver is busy setting or unsetting the KVM pointer. > >> A wait queue was employed to allow functions requiring access to the KVM > >> pointer to wait for the kvm_busy flag to be cleared. For the duration of > >> the wait period, the mdev lock was unlocked then acquired again after the > >> kvm_busy flag was cleared. This got rid of the lockdep report, but didn't > >> really resolve the problem. > > Can you please elaborate on the last point. You mean that we can have > > circular locking even after 0cc00c8d4050, but instead of getting stuck in > > on a lock we will get stuck on wait_event_cmd()? If that is it, please > > state it clearly in the description, and if you can to it in the short > > description. > > This patch was in response to the following review comments made by Jason > Gunthorpe: > > * Message ID: <20210525162927.GC1002214@xxxxxxxxxx> > "... the kvm_busy should be replaced by a proper rwsem, > don't try to open code locks like that - it just defeats lockdep > analysis". > > * Message ID: <20210527112433.GX1002214@xxxxxxxxxx> > "Usually when people start open coding locks it is often > because lockdep complained. Open coding a lock makes > lockdep stop because the lockdep code > is removed, but it doesn't fix anything. The kvm_busy > should be replaced by a proper rwsem, don't try to > open code locks like that - it just defeats lockdep > analysis." > > I will paraphrase and include the information from Jason's > comments in the description. > This does not answer my questions. I'm in favor of Jason's proposal, because it is much easier to comprehend simple rwsem protected than a mutex + wait_queue dance. I think Jason was talking about open coding locks in general. I don't consider it as proof of commit 0cc00c8d4050 not doing what it advertised. You can add a Suggested-by tag if you like, but you should be able to tell us what is the merit of your patch. Regards, Halil