Re: Bug report: vfio over kernel 5.19 - mm area

On 15/06/2022 17:02, Alex Williamson wrote:
On Wed, 15 Jun 2022 13:52:10 +0300
Yishai Hadas <yishaih@xxxxxxxxxx> wrote:

Adding some extra relevant people from the MM area.

On 15/06/2022 13:43, Yishai Hadas wrote:
Hi All,

Any idea what could cause the below break in 5.19? We run QEMU and the
machine immediately gets stuck.

After running 'echo l > /proc/sysrq-trigger' I could see the task below,
which seems to be stuck.

This basic flow worked fine in 5.18.

Spent Friday bisecting this and posted this fix:

https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen/

I expect you're hitting the same.  Thanks,

Alex

Alex,

It seems that we hit the same bug again in v6.0-rc1.

The code below [1], from commit [2], puts 'is_zero_pfn()' back under the !(..), which seems buggy.

I would expect the fix below [3].

Alex Sierra,

Can you please review the suggested fix below for your patch and send a patch for rc2 accordingly?

Yishai

[1]

See: https://elixir.bootlin.com/linux/v6.0-rc1/source/include/linux/mm.h#L1549

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a2d01e49253b..64393ed3330a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -28,6 +28,7 @@
 #include <linux/sched.h>
 #include <linux/pgtable.h>
 #include <linux/kasan.h>
+#include <linux/memremap.h>

 struct mempolicy;
 struct anon_vma;
@@ -1537,7 +1538,9 @@ static inline bool is_longterm_pinnable_page(struct page *page)
        if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
                return false;
 #endif
-       return !is_zone_movable_page(page) || is_zero_pfn(page_to_pfn(page));
+       return !(is_device_coherent_page(page) ||
+                is_zone_movable_page(page) ||
+                is_zero_pfn(page_to_pfn(page)));
 }

[2] f25cbb7a95a24ff9a2a3bebd308e303942ae6b2c
Author: Alex Sierra <alex.sierra@xxxxxxx>
Date:   Fri Jul 15 10:05:10 2022 -0500

    mm: add zone device coherent type memory support


[3] Expected fix

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3bedc449c14d..b25f9886bd4c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1544,9 +1544,9 @@ static inline bool is_longterm_pinnable_page(struct page *page)
        if (mt == MIGRATE_CMA || mt == MIGRATE_ISOLATE)
                return false;
 #endif
-       return !(is_device_coherent_page(page) ||
-                is_zone_movable_page(page) ||
-                is_zero_pfn(page_to_pfn(page)));
+       return !is_device_coherent_page(page) ||
+              !is_zone_movable_page(page) ||
+              is_zero_pfn(page_to_pfn(page));
 }
 #else
 static inline bool is_longterm_pinnable_page(struct page *page)


[ 1162.056583] NMI backtrace for cpu 4
[ 1162.056585] CPU: 4 PID: 1979 Comm: qemu-system-x86 Not tainted 5.19.0-rc1 #747
[ 1162.056587] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 1162.056588] RIP: 0010:pmd_huge+0x0/0x20
[ 1162.056592] Code: 49 89 44 24 28 48 8b 47 30 49 89 44 24 30 31 c0 41 5c c3 5b b8 01 00 00 00 5d 41 5c c3 cc cc cc cc cc cc cc cc cc cc cc cc cc <0f> 1f 44 00 00 31 c0 48 f7 c7 9f ff ff ff 74 0f 81 e7 81 00 00 00
[ 1162.056594] RSP: 0018:ffff888146253b38 EFLAGS: 00000202
[ 1162.056596] RAX: ffff888101461980 RBX: ffff888146253bc0 RCX: 000ffffffffff000
[ 1162.056597] RDX: ffff88814fa22000 RSI: 00007f9f68231000 RDI: 000000010a6b6067
[ 1162.056598] RBP: ffff888111b90dc0 R08: 000000000002f424 R09: 0000000000000001
[ 1162.056599] R10: ffffffff825c2a40 R11: 0000000000000a08 R12: ffff88814fa22a08
[ 1162.056600] R13: 000000010a6b6067 R14: 0000000000052202 R15: 00007f9f68231000
[ 1162.056602] FS:  00007f9f6c228c40(0000) GS:ffff88885f900000(0000) knlGS:0000000000000000
[ 1162.056605] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1162.056606] CR2: 00005643994fd0ed CR3: 00000001496da005 CR4: 0000000000372ea0
[ 1162.056607] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1162.056609] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1162.056610] Call Trace:
[ 1162.056611]  <TASK>
[ 1162.056611]  follow_page_mask+0x196/0x5e0
[ 1162.056615]  __get_user_pages+0x190/0x5d0
[ 1162.056617]  ? flush_workqueue_prep_pwqs+0x110/0x110
[ 1162.056620]  __gup_longterm_locked+0xaf/0x470
[ 1162.056624]  vaddr_get_pfns+0x8e/0x240 [vfio_iommu_type1]
[ 1162.056628]  ? qi_flush_iotlb+0x83/0xa0
[ 1162.056631]  vfio_pin_pages_remote+0x326/0x460 [vfio_iommu_type1]
[ 1162.056634]  vfio_iommu_type1_ioctl+0x421/0x14f0 [vfio_iommu_type1]
[ 1162.056638]  __x64_sys_ioctl+0x3e4/0x8e0
[ 1162.056641]  do_syscall_64+0x3d/0x90
[ 1162.056644]  entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 1162.056646] RIP: 0033:0x7f9f6d14317b
[ 1162.056648] Code: 0f 1e fa 48 8b 05 1d ad 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed ac 0c 00 f7 d8 64 89 01 48
[ 1162.056650] RSP: 002b:00007fff4fca15b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1162.056652] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f9f6d14317b
[ 1162.056653] RDX: 00007fff4fca1620 RSI: 0000000000003b71 RDI: 000000000000001c
[ 1162.056654] RBP: 00007fff4fca1650 R08: 0000000000000001 R09: 0000000000000000
[ 1162.056655] R10: 0000000100000000 R11: 0000000000000246 R12: 0000000000000000
[ 1162.056656] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1162.056657]  </TASK>

Yishai





