On Tue, Feb 20, 2024 at 03:53:08PM +0000, Zeng, Xin wrote:
> On Tuesday, February 20, 2024 9:25 PM, Jason Gunthorpe wrote:
> > To: Zeng, Xin <xin.zeng@xxxxxxxxx>
> > Cc: Yishai Hadas <yishaih@xxxxxxxxxx>; herbert@xxxxxxxxxxxxxxxxxxx;
> > alex.williamson@xxxxxxxxxx; shameerali.kolothum.thodi@xxxxxxxxxx; Tian,
> > Kevin <kevin.tian@xxxxxxxxx>; linux-crypto@xxxxxxxxxxxxxxx;
> > kvm@xxxxxxxxxxxxxxx; qat-linux <qat-linux@xxxxxxxxx>; Cao, Yahui
> > <yahui.cao@xxxxxxxxx>
> > Subject: Re: [PATCH 10/10] vfio/qat: Add vfio_pci driver for Intel QAT VF devices
> >
> > On Sat, Feb 17, 2024 at 04:20:20PM +0000, Zeng, Xin wrote:
> >
> > > Thanks for this information, but this flow is not clear to me why it
> > > cause deadlock. From this flow, CPU0 is not waiting for any resource
> > > held by CPU1, so after CPU0 releases mmap_lock, CPU1 can continue
> > > to run. Am I missing something?
> >
> > At some point it was calling copy_to_user() under the state
> > mutex. These days it doesn't.
> >
> > copy_to_user() would nest the mm_lock under the state mutex which is a
> > locking inversion.
> >
> > So I wonder if we still have this problem now that the copy_to_user()
> > is not under the mutex?
>
> In protocol v2, we still have the scenario in precopy_ioctl where copy_to_user is
> called under state_mutex.

Why? Does mlx5 do that? It looked Ok to me:

	mlx5vf_state_mutex_unlock(mvdev);
	if (copy_to_user((void __user *)arg, &info, minsz))
		return -EFAULT;

Jason
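
For reference, here is a userspace sketch of the ordering discussed in this thread: snapshot the state while holding the state mutex, drop the mutex, and only then copy the snapshot out to the caller. It is an analogy, not kernel code. A pthread mutex stands in for the state mutex and a memcpy() wrapper stands in for copy_to_user(). Every demo_* name and both struct layouts are hypothetical and are not taken from mlx5 or the QAT driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device state; not taken from any real driver. */
struct demo_dev {
	pthread_mutex_t state_mutex;
	unsigned long initial_bytes;
	unsigned long dirty_bytes;
};

/* Hypothetical reply structure handed back to the caller. */
struct demo_info {
	unsigned long initial_bytes;
	unsigned long dirty_bytes;
};

/*
 * Stand-in for copy_to_user(): returns the number of bytes NOT copied.
 * A NULL destination models a faulting user pointer.
 */
static size_t demo_copy_out(void *dst, const void *src, size_t len)
{
	if (!dst)
		return len;
	memcpy(dst, src, len);
	return 0;
}

/* Returns 0 on success, -1 where kernel code would return -EFAULT. */
static int demo_precopy_info(struct demo_dev *dev, void *user_buf)
{
	struct demo_info info;

	assert(dev != NULL);

	/* Take a local snapshot while the state mutex is held. */
	pthread_mutex_lock(&dev->state_mutex);
	info.initial_bytes = dev->initial_bytes;
	info.dirty_bytes = dev->dirty_bytes;
	pthread_mutex_unlock(&dev->state_mutex);

	/*
	 * Copy out only after the mutex is dropped, so any lock the copy
	 * may need is never acquired while the state mutex is held.
	 */
	if (demo_copy_out(user_buf, &info, sizeof(info)))
		return -1;
	return 0;
}
```

Because the copy works from a stack-local snapshot, the mutex does not need to be held across it, and a failed copy leaves the mutex already released.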