Re: nouveau broken on Riva TNT2 in 5.13.0-rc4: NULL pointer dereference in nouveau_bo_sync_for_device

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





Am 10.06.21 um 19:59 schrieb Christian König:
Am 10.06.21 um 19:50 schrieb Ondrej Zary:
[SNIP]
I can't see how this is called from the nouveau code, only possibility I
see is that it is maybe called through the AGP code somehow.
Yes, you're right:
[   13.192663] Call Trace:
[   13.192678]  dump_stack+0x54/0x68
[   13.192690]  ttm_tt_init+0x11/0x8a [ttm]
[   13.192699]  ttm_agp_tt_create+0x39/0x51 [ttm]
[   13.192840]  nouveau_ttm_tt_create+0x17/0x22 [nouveau]
[   13.192856]  ttm_tt_create+0x78/0x8c [ttm]
[   13.192864]  ttm_bo_handle_move_mem+0x7d/0xca [ttm]
[   13.192873]  ttm_bo_validate+0x92/0xc8 [ttm]
[   13.192883]  ttm_bo_init_reserved+0x216/0x243 [ttm]
[   13.192892]  ttm_bo_init+0x45/0x65 [ttm]
[   13.193018]  ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
[   13.193150]  nouveau_bo_init+0x8c/0x94 [nouveau]
[   13.193273]  ? nouveau_bo_del_io_reserve_lru+0x48/0x48 [nouveau]
[   13.193407]  nouveau_bo_new+0x44/0x57 [nouveau]
[   13.193537]  nouveau_channel_prep+0xa3/0x269 [nouveau]
[   13.193665]  nouveau_channel_new+0x3c/0x5f7 [nouveau]
[   13.193679]  ? slab_free_freelist_hook+0x3b/0xa7
[   13.193686]  ? kfree+0x9e/0x11a
[   13.193781]  ? nvif_object_sclass_put+0xd/0x16 [nouveau]
[   13.193908]  nouveau_drm_device_init+0x2e2/0x646 [nouveau]
[   13.193924]  ? pci_enable_device_flags+0x1e/0xac
[   13.194052]  nouveau_drm_probe+0xeb/0x188 [nouveau]
[   13.194182]  ? nouveau_drm_device_init+0x646/0x646 [nouveau]
[   13.194195]  pci_device_probe+0x89/0xe9
[   13.194205]  really_probe+0x127/0x2a7
[   13.194212]  driver_probe_device+0x5b/0x87
[   13.194219]  device_driver_attach+0x2e/0x41
[   13.194226]  __driver_attach+0x7c/0x83
[   13.194232]  bus_for_each_dev+0x4c/0x66
[   13.194238]  driver_attach+0x14/0x16
[   13.194244]  ? device_driver_attach+0x41/0x41
[   13.194251]  bus_add_driver+0xc5/0x16c
[   13.194258]  driver_register+0x87/0xb9
[   13.194265]  __pci_register_driver+0x38/0x3b
[   13.194271]  ? 0xf0c0d000
[   13.194362]  nouveau_drm_init+0x14c/0x1000 [nouveau]

How is ttm_dma_tt->dma_address allocated?

Mhm, I need to double check how AGP is supposed to work.

Since barely anybody is using it these days it is something which breaks from time to time.

I have no idea how that ever worked in the first place since AGP isn't supposed to sync between CPU/GPU. Everything is coherent for that case.

Anyway here is a patch which adds a check to those functions if the dma_address array is allocated in the first place. Please test it.

Thanks,
Christian.


Thanks for the backtrace,
Christian.

  I cannot find any assignment
executed (in the working code):

$ git grep dma_address\ = drivers/gpu/
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c: sg->sgl->dma_address = addr; drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c: dma_address = &dma->dma_address[offset >> PAGE_SHIFT]; drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c: dma_address = (mm_node->start << PAGE_SHIFT) + offset;
drivers/gpu/drm/i915/gvt/scheduler.c:   sg->dma_address = addr;
drivers/gpu/drm/i915/i915_gpu_error.c:  sg->dma_address = it;
drivers/gpu/drm/ttm/ttm_tt.c:   ttm->dma_address = (void *) (ttm->ttm.pages + ttm->ttm.num_pages); drivers/gpu/drm/ttm/ttm_tt.c:   ttm->dma_address = kvmalloc_array(ttm->ttm.num_pages,
drivers/gpu/drm/ttm/ttm_tt.c:   ttm_dma->dma_address = NULL;
drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c: viter->dma_address = &__vmw_piter_phys_addr; drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c: viter->dma_address = &__vmw_piter_dma_addr; drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c: viter->dma_address = &__vmw_piter_sg_addr;

The 2 cases in ttm_tt.c are in ttm_dma_tt_alloc_page_directory() and
ttm_sg_tt_alloc_page_directory().
Confirmed by adding printk()s that they're NOT called.




>From 5370102729c6ecb280712c40b92ff7b9f58c6e1e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@xxxxxxx>
Date: Fri, 11 Jun 2021 14:34:50 +0200
Subject: [PATCH] drm/nouveau: check dma_address array for CPU/GPU sync
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

AGP for example doesn't have a dma_address array.

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
---
 drivers/gpu/drm/nouveau/nouveau_bo.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 085023624fb0..1a52590f5303 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -551,7 +551,7 @@ nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
 	struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
 	int i, j;
 
-	if (!ttm_dma)
+	if (!ttm_dma || !ttm_dma->dma_address)
 		return;
 	if (!ttm_dma->pages) {
 		NV_DEBUG(drm, "ttm_dma 0x%p: pages NULL\n", ttm_dma);
@@ -587,7 +587,7 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
 	struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
 	int i, j;
 
-	if (!ttm_dma)
+	if (!ttm_dma || !ttm_dma->dma_address)
 		return;
 	if (!ttm_dma->pages) {
 		NV_DEBUG(drm, "ttm_dma 0x%p: pages NULL\n", ttm_dma);
-- 
2.25.1


[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux