I find that ttm_bo_swapout tries to swap the first BO from the swap LRU
list. But I don't find what removes a BO from that list. It looks like
the same BO can be swapped again, even when it's already swapped.

What am I missing? Should swapped out BOs still be on the swap LRU list?
In that case we need to add a condition in ttm_bo_swapout to prevent
swapping out a BO that's already swapped.

Regards,
  Felix


On 2018-01-17 06:50 PM, Felix Kuehling wrote:
> On 2018-01-17 03:33 PM, Andrey Grodzovsky wrote:
>> I have a private libdrm amdgpu test which allocates very big BOs in
>> loop until all VRAM, GTT and swap are full, and I don't release them
>> in the test (yet).
>>
>> Once the test process terminates everything always gets cleared
>> including swap. Could this point to KFD specific issue?
> That's possible.
>
> I added some WARNs:
>
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index 5a046a3..d68141e 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -175,7 +175,7 @@ void ttm_tt_destroy(struct ttm_tt *ttm)
>  
>  	if (ttm->state == tt_unbound)
>  		ttm_tt_unpopulate(ttm);
> -
> +	WARN_ON(ttm->page_flags & TTM_PAGE_FLAG_PERSISTENT_SWAP);
>  	if (!(ttm->page_flags & TTM_PAGE_FLAG_PERSISTENT_SWAP) &&
>  	    ttm->swap_storage)
>  		fput(ttm->swap_storage);
> @@ -321,6 +321,7 @@ int ttm_tt_swapin(struct ttm_tt *ttm)
>  
>  	return 0;
>  out_err:
> +	WARN(1, "Returning error, not freeing swap_storage");
>  	return ret;
>  }
>  
> @@ -336,7 +337,8 @@ int ttm_tt_swapout(struct ttm_tt *ttm, struct file *persistent_swap_storage)
>  	BUG_ON(ttm->state != tt_unbound && ttm->state != tt_unpopulated);
>  	BUG_ON(ttm->caching_state != tt_cached);
>  
> -	if (!persistent_swap_storage) {
> +	if (!persistent_swap_storage) {
> +		WARN(ttm->swap_storage, "already has swap storage");
>  		swap_storage = shmem_file_setup("ttm swap",
>  						ttm->num_pages << PAGE_SHIFT,
>  						0);
>
>
> And noticed that ttm_bo_swapout is getting called on BOs that already
> have swap space. I think that means it's trying to swap out a BO that's
> already swapped out, and that's where it's leaking the pointer to
> already allocated swap space:
>
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602083] ------------[ cut here ]------------
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602086] already has swap storage
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602124] WARNING: CPU: 8 PID: 1940 at /home/fkuehlin/compute/kernel/drivers/gpu/drm/ttm/ttm_tt.c:341 ttm_tt_swapout+0x230/0x250 [ttm]
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602126] Modules linked in: ip6_tables(E) ip_tables(E) x_tables(E) x86_pkg_temp_thermal(E) amdkfd(E) amd_iommu_v2(E) amdgpu(E) chash(E) gpu_sched(E) ttm(E)
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602139] CPU: 8 PID: 1940 Comm: kworker/u24:6 Tainted: G W E 4.15.0-rc2-kfd-fkuehlin #7
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602141] Hardware name: ASUS All Series/X99-E WS/USB 3.1, BIOS 2006 04/07/2016
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602144] Workqueue: ttm_swap ttm_shrink_work [ttm]
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602147] task: 00000000894fffc6 task.stack: 000000008f73bd43
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602150] RIP: 0010:ttm_tt_swapout+0x230/0x250 [ttm]
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602151] RSP: 0018:ffffa87a43633ce8 EFLAGS: 00010296
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602153] RAX: 0000000000000018 RBX: ffff90ba3af6b858 RCX: 0000000000000006
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602154] RDX: 0000000000001027 RSI: ffff90bbe3306df8 RDI: 0000000000000202
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602155] RBP: ffff90bbddabde00 R08: 0000000000000000 R09: 0000000000000000
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602156] R10: 000000005b2635d0 R11: 0000000000000000 R12: ffff90ba3af6b88c
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602157] R13: ffffa87a43633e3a R14: ffff90bbdda59d70 R15: ffff90bbdda59ce0
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602159] FS: 0000000000000000(0000) GS:ffff90bbe7400000(0000) knlGS:0000000000000000
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602160] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602161] CR2: 0000000001afbd58 CR3: 00000003f5010002 CR4: 00000000001606e0
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602162] Call Trace:
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602171] ttm_bo_swapout+0x23a/0x260 [ttm]
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602175] ? ttm_shrink+0xa8/0xf0 [ttm]
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602179] ttm_shrink+0xb6/0xf0 [ttm]
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602184] ttm_shrink_work+0x31/0x40 [ttm]
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602189] process_one_work+0x19d/0x430
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602191] ? process_one_work+0x136/0x430
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602196] worker_thread+0x45/0x430
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602202] kthread+0x134/0x170
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602204] ? process_one_work+0x430/0x430
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602206] ? kthread_delayed_work_timer_fn+0x80/0x80
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602212] ret_from_fork+0x24/0x30
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602218] Code: 89 45 40 31 c0 e9 34 ff ff ff 48 c7 c7 a8 ea 13 c0 e8 0b ff f8 ed 8b 44 24 08 e9 1f ff ff ff 48 c7 c7 de fa 13 c0 e8 d0 85 f2 ed <0f> ff eb 80 48 89 ef e8 14 fc ff ff 48 8b 04 24 48 89 45 40 8b
> Jan 17 18:40:06 fkuehlin-hsatest2 kernel: [ 196.602268] ---[ end trace 019b6398cabc8266 ]---
>
> Regards,
>   Felix
>
>
>> Thanks,
>>
>> Andrey
>>
>>
>> On 01/16/2018 10:21 PM, Felix Kuehling wrote:
>>> I'm running an eviction stress test with KFD and find that sometimes it
>>> starts swapping. When that happens, swap usage goes up rapidly, but it
>>> never comes down. Even after the processes terminate, and all VRAM and
>>> GTT allocations are freed (checked in
>>> /sys/kernel/debug/dri/0/amdgpu_{gtt|vram}_mm), swap space is still not
>>> released.
>>>
>>> Running the test repeatedly I was able to trigger the OOM killer quite
>>> easily. The system died with a panic, running out of processes to kill.
>>>
>>> The symptoms look like swap space is only allocated but never released.
>>>
>>> A quick look at the swapping code in ttm_tt.c doesn't show any obvious
>>> problems. I'm assuming that fput should free swap space. That should
>>> happen when BOs are swapped back in, or destroyed. As far as I can tell,
>>> amdgpu doesn't use persistent swap space, so I'm ignoring
>>> TTM_PAGE_FLAG_PERSISTENT_SWAP.
>>>
>>> Any other ideas or pointers?
>>>
>>> Thanks,
>>>   Felix
>>>
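
To make the question at the top of this mail concrete, here is a small
userspace C model of the suspected behaviour. It is only a sketch and not
TTM code: struct model_bo and all model_* functions are hypothetical names
invented for illustration, and the singly linked list only stands in for
the swap LRU. It assumes that a swapped-out BO stays on the list, which is
exactly the open question. The unguarded variant allocates swap storage
for the same BO twice (the suspected leak); the guarded variant skips BOs
that already have swap storage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; these are NOT the real TTM structures. */
struct model_bo {
	bool has_swap_storage;	/* stands in for ttm->swap_storage != NULL */
	int swap_allocs;	/* how often swap storage was allocated */
	struct model_bo *next;	/* next entry on the model swap LRU */
};

/*
 * Model of allocating swap storage for a BO. Calling this twice on the
 * same BO overwrites the first allocation, i.e. it would be leaked.
 */
static void model_tt_swapout(struct model_bo *bo)
{
	assert(bo != NULL);
	bo->has_swap_storage = true;
	bo->swap_allocs++;
}

/*
 * Suspected current behaviour: always take the first BO on the LRU.
 * The BO is not removed from the list afterwards.
 */
static struct model_bo *model_swapout_unguarded(struct model_bo *lru)
{
	if (lru)
		model_tt_swapout(lru);
	return lru;
}

/*
 * Proposed condition: skip BOs that are already swapped out and take
 * the first one that is not. Returns NULL if nothing is left to swap.
 */
static struct model_bo *model_swapout_guarded(struct model_bo *lru)
{
	struct model_bo *bo;

	for (bo = lru; bo; bo = bo->next) {
		if (bo->has_swap_storage)
			continue;
		model_tt_swapout(bo);
		return bo;
	}
	return NULL;
}
```

With two BOs on the list, calling model_swapout_unguarded() twice leaves
the first BO with swap_allocs == 2, while model_swapout_guarded() visits
each BO once and then returns NULL. Whether the real fix should be such a
check or removing the BO from the swap LRU is not decided by this model.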