On Mon, Jul 15, 2019 at 06:51:06PM +0200, Michel Dänzer wrote:
>
> With a KASAN enabled kernel built from amd-staging-drm-next, the
> attached use-after-free is pretty reliably detected during a piglit gpu run.

Does the branch you are testing have hmm.git merged? I think from
the name it does not?

Use-after-frees of this nature were something that was fixed in
hmm.git.

I don't see an obvious way you can hit something like this with the
new code arrangement.

> P.S. With my standard kernels without KASAN (currently 5.2.y + drm-next
> changes for 5.3), I'm having trouble lately completing a piglit run,
> running into various issues which look like memory corruption, so might
> be related.

I'm skeptical that the AMDGPU implementation of the locking around the
hmm_range & mirror is working; it doesn't follow the prescribed pattern
at least.

> Jul 15 18:09:29 kaveri kernel: [ 560.388751][T12568] ==================================================================
> Jul 15 18:09:29 kaveri kernel: [ 560.389063][T12568] BUG: KASAN: use-after-free in __mmu_notifier_release+0x286/0x3e0
> Jul 15 18:09:29 kaveri kernel: [ 560.389068][T12568] Read of size 8 at addr ffff88835e1c7cb0 by task amd_pinned_memo/12568
> Jul 15 18:09:29 kaveri kernel: [ 560.389071][T12568]
> Jul 15 18:09:29 kaveri kernel: [ 560.389077][T12568] CPU: 9 PID: 12568 Comm: amd_pinned_memo Tainted: G OE 5.2.0-rc1-00811-g2ad5a7d31bdf #125
> Jul 15 18:09:29 kaveri kernel: [ 560.389080][T12568] Hardware name: Micro-Star International Co., Ltd. MS-7A34/B350 TOMAHAWK (MS-7A34), BIOS 1.80 09/13/2017
> Jul 15 18:09:29 kaveri kernel: [ 560.389084][T12568] Call Trace:
> Jul 15 18:09:29 kaveri kernel: [ 560.389091][T12568] dump_stack+0x7c/0xc0
> Jul 15 18:09:29 kaveri kernel: [ 560.389097][T12568] ? __mmu_notifier_release+0x286/0x3e0
> Jul 15 18:09:29 kaveri kernel: [ 560.389101][T12568] print_address_description+0x65/0x22e
> Jul 15 18:09:29 kaveri kernel: [ 560.389106][T12568] ? __mmu_notifier_release+0x286/0x3e0
> Jul 15 18:09:29 kaveri kernel: [ 560.389110][T12568] ? __mmu_notifier_release+0x286/0x3e0
> Jul 15 18:09:29 kaveri kernel: [ 560.389115][T12568] __kasan_report.cold.3+0x1a/0x3d
> Jul 15 18:09:29 kaveri kernel: [ 560.389122][T12568] ? __mmu_notifier_release+0x286/0x3e0
> Jul 15 18:09:29 kaveri kernel: [ 560.389128][T12568] kasan_report+0xe/0x20
> Jul 15 18:09:29 kaveri kernel: [ 560.389132][T12568] __mmu_notifier_release+0x286/0x3e0

So we are iterating over the mn list and touched freed memory.

> Jul 15 18:09:29 kaveri kernel: [ 560.389309][T12568] Allocated by task 12568:
> Jul 15 18:09:29 kaveri kernel: [ 560.389314][T12568] save_stack+0x19/0x80
> Jul 15 18:09:29 kaveri kernel: [ 560.389318][T12568] __kasan_kmalloc.constprop.8+0xc1/0xd0
> Jul 15 18:09:29 kaveri kernel: [ 560.389323][T12568] hmm_get_or_create+0x8f/0x3f0

The memory is probably a struct hmm.

> Jul 15 18:09:29 kaveri kernel: [ 560.389857][T12568] Freed by task 12568:
> Jul 15 18:09:29 kaveri kernel: [ 560.389860][T12568] save_stack+0x19/0x80
> Jul 15 18:09:29 kaveri kernel: [ 560.389864][T12568] __kasan_slab_free+0x125/0x170
> Jul 15 18:09:29 kaveri kernel: [ 560.389867][T12568] kfree+0xe2/0x290
> Jul 15 18:09:29 kaveri kernel: [ 560.389871][T12568] __mmu_notifier_release+0xef/0x3e0
> Jul 15 18:09:29 kaveri kernel: [ 560.389875][T12568] exit_mmap+0x93/0x400

And the free was also done in notifier_release (presumably the
backtrace is corrupt and this is really in the old hmm_release ->
hmm_put -> hmm_free -> kfree call chain).

Which was not OK: __mmu_notifier_release doesn't use a 'safe' hlist
iterator, so the release callback must never trigger kfree of a struct
mmu_notifier.

The new hmm.git code does not call kfree from release; it schedules
the free through SRCU, which won't run until __mmu_notifier_release
returns, by definition.

So this should be fixed.

Jason
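
P.S. For illustration only, the iterator problem and the deferred-free
fix can be sketched in userspace C. This is not kernel code: struct
notifier, walk_and_release(), release_deferred() and the pending list
are all hypothetical stand-ins. The plain for-loop plays the role of the
non-'safe' hlist walk, and the pending list plays the role of a free
deferred until after the walk returns (which is roughly what the SRCU
deferral buys, under my reading above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a struct mmu_notifier sitting on a list. */
struct notifier {
	struct notifier *next;               /* list linkage read by the walker */
	void (*release)(struct notifier *n); /* callback invoked during the walk */
	struct notifier *defer_next;         /* linkage on the pending-free list */
};

static struct notifier *pending; /* nodes whose free has been deferred */

/*
 * Release callback that defers the free. Calling free(n) here instead
 * would be a use-after-free, because the walker below still reads
 * n->next after this callback returns.
 */
static void release_deferred(struct notifier *n)
{
	n->defer_next = pending;
	pending = n;
}

/* Non-'safe' walk: n->next is loaded only after n->release() returns. */
static int walk_and_release(struct notifier *head)
{
	int visited = 0;

	for (struct notifier *n = head; n; n = n->next) {
		n->release(n);
		visited++;
	}
	return visited;
}

/* Runs only after the walk has finished, so no walker can see the nodes. */
static int run_deferred_frees(void)
{
	int freed = 0;

	while (pending) {
		struct notifier *n = pending;

		pending = n->defer_next;
		free(n);
		freed++;
	}
	return freed;
}

/* Builds a three-node list, walks it, then frees; returns nodes visited. */
int demo_deferred_release(void)
{
	struct notifier *head = NULL;

	for (int i = 0; i < 3; i++) {
		struct notifier *n = calloc(1, sizeof(*n));

		if (!n)
			abort();
		n->release = release_deferred;
		n->next = head;
		head = n;
	}

	int visited = walk_and_release(head);
	int freed = run_deferred_frees();

	assert(freed == visited);
	return visited;
}
```

Swapping release_deferred() for a callback that calls free(n) directly
reproduces the shape of the bug in the report: the loop increment then
dereferences a freed node.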