Re: [PATCH v2] drm/amdgpu: fix fdinfo race with process exit

On 2021-08-04 5:04 a.m., Christian König wrote:
Sorry I'm on vacation and can't reply immediately.

This is the wrong approach. The fdinfo should have grabbed a reference to the fd it prints the info for.

So we should never race here. Can you double check how this happens?

This backtrace happened once; it comes from /var/crash/..$date../vmcode-dmesg.log on the server machine. I cannot reproduce the issue. Grepping the app folder shows Python scripts that access /proc/pid/node_id/fdinfo. The crash happened after the app was killed by a segmentation fault.

fdinfo grabs a reference to fpriv, but not to fpriv->vm.root.bo. I think the latter is needed; otherwise amdgpu_bo_reserve(fpriv->vm.root.bo) may dereference a NULL pointer.
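
To illustrate the pattern only, here is a minimal userspace sketch, not the driver code. All toy_* names are hypothetical stand-ins invented for this example. It assumes a ref helper that returns NULL when given NULL, and it uses a plain int counter and a single thread for brevity, so it does not model the window between loading the root pointer and taking the reference.

```c
/* Userspace sketch only. toy_bo, toy_vm and the toy_* helpers are
 * hypothetical stand-ins for illustration, not the amdgpu API. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_bo {
    int refcount;       /* plain int: single-threaded sketch only */
    size_t size_kb;
};

struct toy_vm {
    struct toy_bo *root;    /* set to NULL once the VM is torn down */
};

static struct toy_bo *toy_bo_create(size_t size_kb)
{
    struct toy_bo *bo = malloc(sizeof(*bo));

    if (!bo)
        return NULL;
    bo->refcount = 1;       /* the creator holds the first reference */
    bo->size_kb = size_kb;
    return bo;
}

/* Take an extra reference; a NULL input yields NULL. */
static struct toy_bo *toy_bo_ref(struct toy_bo *bo)
{
    if (!bo)
        return NULL;
    bo->refcount++;
    return bo;
}

/* Drop one reference, free on the last one, and clear the caller's pointer. */
static void toy_bo_unref(struct toy_bo **bo)
{
    if (!bo || !*bo)
        return;
    assert((*bo)->refcount > 0);
    if (--(*bo)->refcount == 0)
        free(*bo);
    *bo = NULL;
}

/* Process-exit path: the VM drops its reference to the root BO. */
static void toy_vm_fini(struct toy_vm *vm)
{
    toy_bo_unref(&vm->root);
}

/* Reader path: bail out if the root BO is already gone, otherwise hold
 * a reference for as long as the BO is being used. */
static int toy_show_fdinfo(struct toy_vm *vm, size_t *size_kb)
{
    struct toy_bo *root = toy_bo_ref(vm->root);

    if (!root)
        return -1;
    *size_kb = root->size_kb;
    toy_bo_unref(&root);
    return 0;
}
```

Calling toy_show_fdinfo() after toy_vm_fini() returns -1 instead of touching a freed or NULL root, which is the behaviour the NULL check in the patch is after.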

Regards,

Philip

Thanks,
Christian.

On 03.08.21 at 16:06, philip yang wrote:

ping?

On 2021-07-29 10:13 p.m., Philip Yang wrote:
Get process vm root BO ref in case process is exiting and root BO is
freed, to avoid NULL pointer dereference backtrace:

BUG: unable to handle kernel NULL pointer dereference at
0000000000000000
Call Trace:
amdgpu_show_fdinfo+0xfe/0x2a0 [amdgpu]
seq_show+0x12c/0x180
seq_read+0x153/0x410
vfs_read+0x91/0x140
ksys_read+0x4f/0xb0
do_syscall_64+0x5b/0x1a0
entry_SYSCALL_64_after_hwframe+0x65/0xca

v2: rebase to staging

Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 11 +++++++++--
  1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
index d94c5419ec25..5a6857c44bb6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c
@@ -59,6 +59,7 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
      uint64_t vram_mem = 0, gtt_mem = 0, cpu_mem = 0;
      struct drm_file *file = f->private_data;
      struct amdgpu_device *adev = drm_to_adev(file->minor->dev);
+    struct amdgpu_bo *root;
      int ret;

      ret = amdgpu_file_to_fpriv(f, &fpriv);
@@ -69,13 +70,19 @@ void amdgpu_show_fdinfo(struct seq_file *m, struct file *f)
      dev = PCI_SLOT(adev->pdev->devfn);
      fn = PCI_FUNC(adev->pdev->devfn);

-    ret = amdgpu_bo_reserve(fpriv->vm.root.bo, false);
+    root = amdgpu_bo_ref(fpriv->vm.root.bo);
+    if (!root)
+        return;
+
+    ret = amdgpu_bo_reserve(root, false);
      if (ret) {
          DRM_ERROR("Fail to reserve bo\n");
          return;
      }
      amdgpu_vm_get_memory(&fpriv->vm, &vram_mem, &gtt_mem, &cpu_mem);
-    amdgpu_bo_unreserve(fpriv->vm.root.bo);
+    amdgpu_bo_unreserve(root);
+    amdgpu_bo_unref(&root);
+
      seq_printf(m, "pdev:\t%04x:%02x:%02x.%d\npasid:\t%u\n", domain, bus,
              dev, fn, fpriv->vm.pasid);
      seq_printf(m, "vram mem:\t%llu kB\n", vram_mem/1024UL);

