On 06/27/2017 08:45 AM, huang ying wrote: > On Sat, Jun 24, 2017 at 6:04 AM, <akpm@xxxxxxxxxxxxxxxxxxxx> wrote: >> * mm-page_allocc-eliminate-unsigned-confusion-in-__rmqueue_fallback.patch > > After git bisecting, find the above patch will cause the following bug > on i386 with memory eater + swap. > > [ 10.657876] BUG: unable to handle kernel paging request at 001fe2b8 > [ 10.658412] IP: set_pfnblock_flags_mask+0x50/0x80 > [ 10.658779] *pde = 00000000 > [ 10.658779] > [ 10.659126] Oops: 0000 [#1] SMP > [ 10.659372] CPU: 0 PID: 1403 Comm: usemem Not tainted 4.12.0-rc6-mm1+ #12 > [ 10.659888] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > BIOS 1.10.2-1 04/01/2014 > [ 10.660522] task: f54a4c40 task.stack: f54ee000 > [ 10.660878] EIP: set_pfnblock_flags_mask+0x50/0x80 > [ 10.661246] EFLAGS: 00010006 CPU: 0 > [ 10.661517] EAX: 0007f8ae EBX: 00000000 ECX: 00000009 EDX: 00000200 > [ 10.661994] ESI: 001fe2b8 EDI: 00000e00 EBP: f54efd8c ESP: f54efd80 > [ 10.662473] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 > [ 10.662891] CR0: 80050033 CR2: 001fe2b8 CR3: 356a3000 CR4: 00000690 > [ 10.663378] Call Trace: > [ 10.663577] set_pageblock_migratetype+0x31/0x40 > [ 10.663933] __rmqueue+0x367/0x560 > [ 10.664197] get_page_from_freelist+0x5b7/0x8e0 > [ 10.664546] __alloc_pages_nodemask+0x31a/0x1000 > [ 10.664913] ? handle_mm_fault+0x1e8/0x840 > [ 10.665230] handle_mm_fault+0x71d/0x840 > [ 10.665537] __do_page_fault+0x175/0x400 > [ 10.665848] ? vmalloc_sync_all+0x190/0x190 > [ 10.666173] do_page_fault+0xb/0x10 > [ 10.666446] common_exception+0x64/0x6a > [ 10.666742] EIP: 0x8005e04c > [ 10.666959] EFLAGS: 00010246 CPU: 0 > [ 10.667229] EAX: 07d47400 EBX: 80063000 ECX: bfc964d8 EDX: 67179000 > [ 10.667705] ESI: 07d47400 EDI: 07d47400 EBP: 00000000 ESP: bfc962cc > [ 10.668180] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b > [ 10.668595] ? vmalloc_sync_all+0x190/0x190 > [ 10.668922] Code: 8b 5b 28 25 00 fc ff ff 29 c1 89 c8 b9 1f 00 00 > 00 2b 4d 08 c1 e8 0a c1 e0 02 89 c6 c1 e8 05 83 e6 1f 29 f1 8d 34 83 > d3 e7 d3 e2 <8b> 1e f7 d7 eb 0c 8d 76 00 8d bc 27 00 00 00 00 89 c3 89 > d9 89 > [ 10.670369] EIP: set_pfnblock_flags_mask+0x50/0x80 SS:ESP: 0068:f54efd80 > [ 10.670881] CR2: 00000000001fe2b8 > [ 10.671140] ---[ end trace f51518af57e6b531 ]--- > > I think this comes from the signed and unsigned int comparison on > i386. The gcc version is, Yes, the unsigned vs signed comparison is wrong, and effectively the same problem as the previous wrong attempt, which removed the order >= 0 condition. Thanks for the report. However, the patch in mmotm seems to be missing this crucial hunk that Rasmus had in the patch he sent [1]: -__rmqueue_fallback(struct zone *zone, unsigned int order, int start_migratetype) +__rmqueue_fallback(struct zone *zone, int order, int start_migratetype) which makes this a signed vs signed comparison. What happened to it? Andrew? [1] http://lkml.kernel.org/r/20170621185529.2265-1-linux@xxxxxxxxxxxxxxxxxx > gcc (Debian 6.3.0-18) 6.3.0 20170516 > > Best Regards, > Huang, Ying > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@xxxxxxxxx. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a> >