On 21/02/2024 15:18, Jason Gunthorpe wrote:
On Wed, Feb 21, 2024 at 08:44:31AM +0000, Zeng, Xin wrote:
On Wednesday, February 21, 2024 1:03 AM, Jason Gunthorpe wrote:
On Tue, Feb 20, 2024 at 03:53:08PM +0000, Zeng, Xin wrote:
On Tuesday, February 20, 2024 9:25 PM, Jason Gunthorpe wrote:
To: Zeng, Xin <xin.zeng@xxxxxxxxx>
Cc: Yishai Hadas <yishaih@xxxxxxxxxx>; herbert@xxxxxxxxxxxxxxxxxxx;
alex.williamson@xxxxxxxxxx; shameerali.kolothum.thodi@xxxxxxxxxx;
Tian,
Kevin <kevin.tian@xxxxxxxxx>; linux-crypto@xxxxxxxxxxxxxxx;
kvm@xxxxxxxxxxxxxxx; qat-linux <qat-linux@xxxxxxxxx>; Cao, Yahui
<yahui.cao@xxxxxxxxx>
Subject: Re: [PATCH 10/10] vfio/qat: Add vfio_pci driver for Intel QAT VF
devices
On Sat, Feb 17, 2024 at 04:20:20PM +0000, Zeng, Xin wrote:
Thanks for this information, but this flow is not clear to me why it
cause deadlock. From this flow, CPU0 is not waiting for any resource
held by CPU1, so after CPU0 releases mmap_lock, CPU1 can continue
to run. Am I missing something?
At some point it was calling copy_to_user() under the state
mutex. These days it doesn't.
copy_to_user() would nest the mm_lock under the state mutex which is
a
locking inversion.
So I wonder if we still have this problem now that the copy_to_user()
is not under the mutex?
In protocol v2, we still have the scenario in precopy_ioctl where
copy_to_user is
called under state_mutex.
Why? Does mlx5 do that? It looked Ok to me:
mlx5vf_state_mutex_unlock(mvdev);
if (copy_to_user((void __user *)arg, &info, minsz))
return -EFAULT;
Indeed, thanks, Jason. BTW, is there any reason why was "deferred_reset" mode
still implemented in mlx5 driver given this deadlock condition has been avoided
with migration protocol v2 implementation.
I do not remember. Yishai?
Long time passed.., I also don't fully remember whether this was the
only potential problem here, maybe Yes.
My plan is to prepare a cleanup patch for mlx5 and put it into our
regression for a while, if all will be good, I may send it for the next
kernel cycle.
By the way, there are other drivers around (i.e. hisi and mtty) that
still run copy_to_user() under the state mutex and might hit the problem
without the 'deferred_rest', see here[1].
If we'll reach to the conclusion that the only reason for that mechanism
was the copy_to_user() under the state_mutex, those drivers can change
their code easily and also send a patch to cleanup the 'deferred_reset'.
[1]
https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c#L808
[2]
https://elixir.bootlin.com/linux/v6.8-rc5/source/samples/vfio-mdev/mtty.c#L878
Yishai