Re: [PATCH] drm/radeon: fix handling of radeon_vm_bo_rmv v3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 18.07.2014 11:44, schrieb Michel Dänzer:
On 18.07.2014 15:56, Christian König wrote:
From: Christian König <christian.koenig@xxxxxxx>

v3: completely rewritten. We now just remember which areas
     of the PT to clear and do so on the next command submission.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=79980

Signed-off-by: Christian König <christian.koenig@xxxxxxx>
[...]

diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index eecff6b..2726b46 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -468,6 +469,15 @@ int radeon_vm_bo_set_addr(struct radeon_device *rdev,
  		head = &tmp->vm_list;
  	}
+ if (bo_va->soffset) {
+		/* add a clone of the bo_va to clear the old address */
+		tmp = kzalloc(sizeof(struct radeon_bo_va), GFP_KERNEL);
+		tmp->soffset = bo_va->soffset;
+		tmp->eoffset = bo_va->eoffset;
+		tmp->vm = vm;
+		INIT_LIST_HEAD(&tmp->vm_status);

?

Unnecessary, would be overwritten by the following list_add anyway.

+		list_add(&tmp->vm_status, &vm->freed);
+	}
[...]

-int radeon_vm_bo_rmv(struct radeon_device *rdev,
-		     struct radeon_bo_va *bo_va)
+void radeon_vm_bo_rmv(struct radeon_device *rdev,
+		      struct radeon_bo_va *bo_va)
  {
-	int r = 0;
+	struct radeon_vm *vm = bo_va->vm;
- mutex_lock(&bo_va->vm->mutex);
-	if (bo_va->soffset)
-		r = radeon_vm_bo_update(rdev, bo_va->vm, bo_va->bo, NULL);
+	list_del(&bo_va->bo_list);
+ mutex_lock(&vm->mutex);
  	list_del(&bo_va->vm_list);
-	mutex_unlock(&bo_va->vm->mutex);
-	list_del(&bo_va->bo_list);
Was there any particular reason for moving the list_del(&bo_va->bo_list)
outside of the VM mutex? I suspect this might be the cause of the
problem below, which I encountered after a few piglit runs.

Ah! Well it's not the root cause of the problem cause we released that lock before as well. But it changed the timing quite a bit and so brought the problem to the surface.

The root cause is that we don't reserve the IB-BO on every command submission, and so a concurrent creating or tear-down of a VM could modify the bo_list without holding a lock.

Thanks for the Info, going to address this in a second patch.

Regards,
Christian.



Other than those two issues, the patch looks good to me, and I haven't
seen any piglit GPU lockups with it on my Bonaire yet.


  general protection fault: 0000 [#1] SMP 153
  Modules linked in: bnep bluetooth rfkill snd_hrtimer snd_seq snd_seq_device cpufreq_stats cpufreq_powersave cpufreq_userspace cpufreq_conservative binfmt_misc nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc hid_generic usbhid hid snd_hda_codec_hdmi evdev snd_hda_codec_realtek snd_hda_codec_generic nls_utf8 nls_cp437 vfat fat kvm_amd kvm crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd microcode ohci_pci psmouse serio_raw pcspkr edac_mce_amd k10temp edac_core r8169 mii snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm snd_timer radeon(O) ttm snd sg drm_kms_helper i2c_piix4 ohci_hcd ehci_pci drm ehci_hcd soundcore i2c_algo_bit i2c_core acpi_cpufreq xhci_hcd video usbcore usb_common processor wmi button thermal_sys fuse autofs4 ext4 crc16 mbcache jbd2 dm_mod sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common ahci libahci libata scsi_mod
  CPU: 2 PID: 14721 Comm: shader_runner Tainted: G           O  3.16.0-rc5+ #143
  Hardware name: System manufacturer System Product Name/A88X-PRO, BIOS 1001 04/01/2014
  task: ffff8800083e4b10 ti: ffff880103bb8000 task.ti: ffff880103bb8000
  RIP: 0010:[<ffffffffc0573672>]  [<ffffffffc0573672>] radeon_vm_bo_find+0x13/0x22 [radeon]
  RSP: 0018:ffff880103bbbbf0  EFLAGS: 00010297
  RAX: dead000000100100 RBX: ffff8800dba24000 RCX: ef7bdef7bdef7bdf
  RDX: 000000000000a7b3 RSI: ffff8802120a9bb8 RDI: ffff8801f1d4f800
  RBP: ffff880103bbbbf0 R08: 0000000000000000 R09: ffffffff00000000
  R10: ffff880103bbba80 R11: ffff8800dba25eb0 R12: ffff880103bbbc18
  R13: ffff8801f1d4f800 R14: ffff8800dba24000 R15: 0000000000000000
  FS:  00007f640e4b7700(0000) GS:ffff88021e200000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f35bc858000 CR3: 00000001024a1000 CR4: 00000000000407e0
  Stack:
   ffff880103bbbdc0 ffffffffc04fb879 ffff8801f1d4f848 ffff880103bbbdf0
   ffff8800dba24018 ffff880215228098 ffff8800dba24000 ffff880003207c00
   0000000000000003 ffff8801f9c399c0 ffff8801fa08a120 0000001700000000
  Call Trace:
   [<ffffffffc04fb879>] radeon_cs_ioctl+0x4eb/0x677 [radeon]
   [<ffffffffc042638e>] drm_ioctl+0x366/0x42c [drm]
   [...]




_______________________________________________
dri-devel mailing list
dri-devel@xxxxxxxxxxxxxxxxxxxxx
http://lists.freedesktop.org/mailman/listinfo/dri-devel





[Index of Archives]     [Linux DRI Users]     [Linux Intel Graphics]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]
  Powered by Linux