Am 18.12.24 um 07:16 schrieb Kasireddy, Vivek:
Hi Christian,
Subject: Re: [PATCH 0/4] cover-letter: Allow MMIO regions to be exported
through dmabuf
I will resend the patch series. I was experiencing issues with my email
client, which inadvertently split the series into two separate emails.
Alternatively I can also copy the code from the list archive and explain why
that doesn't work:
+void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool
+revoked) {
+ struct vfio_pci_dma_buf *priv;
+ struct vfio_pci_dma_buf *tmp;
+
+ lockdep_assert_held_write(&vdev->memory_lock);
This makes sure that the caller is holding vdev->memory_lock.
+
+ list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) {
+ if (!dma_buf_try_get(priv->dmabuf))
This here only works because vfio_pci_dma_buf_release() also grabs vdev-
memory_lock and so we are protected against concurrent releases.
+ continue;
+ if (priv->revoked != revoked) {
+ dma_resv_lock(priv->dmabuf->resv, NULL);
+ priv->revoked = revoked;
+ dma_buf_move_notify(priv->dmabuf);
+ dma_resv_unlock(priv->dmabuf->resv);
+ }
+ dma_buf_put(priv->dmabuf);
The problem is now that this here might drop the last reference which in turn
calls vfio_pci_dma_buf_release() which also tries to grab vdev-
memory_lock and so results in a deadlock.
AFAICS, vfio_pci_dma_buf_release() would not be called synchronously after the
last reference is dropped by dma_buf_put(). This is because fput(), which is called
by dma_buf_put() triggers f_op->release() asynchronously; therefore, a deadlock
is unlikely to occur in this scenario, unless I am overlooking something.
My recollection is that the f_op->release handler is only called
asynchronously if fput() was issued in interrupt context.
But could be that this information is outdated.
Regards,
Christian.
Thanks,
Vivek
+ }
+}
This pattern was suggested multiple times and I had to rejected it every time
because the whole idea is just fundamentally broken.
It's really astonishing how people always come up with the same broken
pattern.
Regards,
Christian.
Apart from that I have to reject the adding of
dma_buf_try_get(), that is clearly not something we should do.
Understood. It appears that Vivek has confirmed that his v2 has
resolved the issue. I will follow up with him to determine if he plans to
resume his patch, and if so, I will apply my last patch on top of his updated
patch series
Thanks,
Wei Lin
Thanks,
Christian.