On Thu, 26 Jun 2014, Sasha Levin wrote: > On 06/25/2014 06:36 PM, Hugh Dickins wrote: > > Sasha, may I trespass on your time, and ask you to revert the previous > > patch from your tree, and give this patch below a try? I am very > > interested to learn if in fact it fixes it for you (as it did for me). > > Hi Hugh, > > Happy to help, Thank you! Though Vlastimil has made it clear that we cannot go forward with that one-liner patch, so I've just proposed another. > and as I often do I will answer with a question. > > I've observed two different issues after reverting the original fix and > applying this new patch. Both of them seem semi-related, but I'm not sure. I've rather concentrated on putting the new patch together, and just haven't had time to give these two much thought - nor shall tomorrow, I'm afraid. > > First, this: > > [ 681.267487] BUG: unable to handle kernel paging request at ffffea0003480048 > [ 681.268621] IP: zap_pte_range (mm/memory.c:1132) Weird, I don't think we've seen anything like that before, have we? I'm pretty sure it's not a consequence of my "index = min(index, end)", but what it portends I don't know. Please confirm mm/memory.c:1132 - that's the "if (PageAnon(page))" line, isn't it? Which indeed matches the code below. So accessing page->mapping is causing an oops... > [ 681.269335] PGD 37fcc067 PUD 37fcb067 PMD 0 > [ 681.269972] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC > [ 681.270952] Dumping ftrace buffer: > [ 681.270952] (ftrace buffer empty) > [ 681.270952] Modules linked in: > [ 681.270952] CPU: 7 PID: 1952 Comm: trinity-c29 Not tainted 3.16.0-rc2-next-20140625-s > asha-00025-g2e02e05-dirty #730 > [ 681.270952] task: ffff8803e6f58000 ti: ffff8803df050000 task.ti: ffff8803df050000 > [ 681.270952] RIP: zap_pte_range (mm/memory.c:1132) > [ 681.270952] RSP: 0018:ffff8803df053c58 EFLAGS: 00010246 > [ 681.270952] RAX: ffffea0003480040 RBX: ffff8803edae7a70 RCX: 0000000003480040 > [ 681.270952] RDX: 00000000d2001730 RSI: 0000000000000000 RDI: 00000000d2001730 > [ 681.270952] RBP: ffff8803df053cf8 R08: ffff88000015cc00 R09: 0000000000000000 > [ 681.270952] R10: 0000000000000001 R11: 0000000000000000 R12: ffffea0003480040 > [ 681.270952] R13: ffff8803df053de8 R14: 00007fc15014f000 R15: 00007fc15014e000 > [ 681.270952] FS: 00007fc15031b700(0000) GS:ffff8801ece00000(0000) knlGS:0000000000000 > 000 > [ 681.270952] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > [ 681.270952] CR2: ffffea0003480048 CR3: 000000001a02e000 CR4: 00000000000006a0 > [ 681.270952] Stack: > [ 681.270952] ffff8803df053de8 00000000d2001000 00000000d2001fff ffff8803e6f58000 > [ 681.270952] 0000000000000000 0000000000000001 ffff880404dd8400 ffff8803e6e31900 > [ 681.270952] 00000000d2001730 ffff88000015cc00 0000000000000000 ffff8804078f8000 > [ 681.270952] Call Trace: > [ 681.270952] unmap_single_vma (mm/memory.c:1256 mm/memory.c:1277 mm/memory.c:1301 mm/m > emory.c:1346) > [ 681.270952] unmap_vmas (mm/memory.c:1375 (discriminator 1)) > [ 681.270952] exit_mmap (mm/mmap.c:2797) > [ 681.270952] ? preempt_count_sub (kernel/sched/core.c:2606) > [ 681.270952] mmput (kernel/fork.c:638) > [ 681.270952] do_exit (kernel/exit.c:744) > [ 681.270952] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) > [ 681.270952] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2557 kernel/locking/ > lockdep.c:2599) > [ 681.270952] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607) > [ 681.270952] do_group_exit (kernel/exit.c:884) > [ 681.270952] SyS_exit_group (kernel/exit.c:895) > [ 681.270952] tracesys (arch/x86/kernel/entry_64.S:542) > [ 681.270952] Code: e8 cf 39 25 03 49 8b 4c 24 10 48 39 c8 74 1c 48 8b 7d b8 48 c1 e1 0c > 48 89 da 48 83 c9 40 4c 89 fe e8 e5 db ff ff 0f 1f 44 00 00 <41> f6 44 24 08 01 74 08 83 6d c8 01 eb 33 66 90 f6 45 a0 40 74 > All code > ======== > 0: e8 cf 39 25 03 callq 0x32539d4 > 5: 49 8b 4c 24 10 mov 0x10(%r12),%rcx > a: 48 39 c8 cmp %rcx,%rax > d: 74 1c je 0x2b > f: 48 8b 7d b8 mov -0x48(%rbp),%rdi > 13: 48 c1 e1 0c shl $0xc,%rcx > 17: 48 89 da mov %rbx,%rdx > 1a: 48 83 c9 40 or $0x40,%rcx > 1e: 4c 89 fe mov %r15,%rsi > 21: e8 e5 db ff ff callq 0xffffffffffffdc0b > 26: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) > 2b:* 41 f6 44 24 08 01 testb $0x1,0x8(%r12) <-- trapping instruction > 31: 74 08 je 0x3b > 33: 83 6d c8 01 subl $0x1,-0x38(%rbp) > 37: eb 33 jmp 0x6c > 39: 66 90 xchg %ax,%ax > 3b: f6 45 a0 40 testb $0x40,-0x60(%rbp) > 3f: 74 00 je 0x41 > > Code starting with the faulting instruction > =========================================== > 0: 41 f6 44 24 08 01 testb $0x1,0x8(%r12) > 6: 74 08 je 0x10 > 8: 83 6d c8 01 subl $0x1,-0x38(%rbp) > c: eb 33 jmp 0x41 > e: 66 90 xchg %ax,%ax > 10: f6 45 a0 40 testb $0x40,-0x60(%rbp) > 14: 74 00 je 0x16 > [ 681.270952] RIP zap_pte_range (mm/memory.c:1132) > [ 681.270952] RSP <ffff8803df053c58> > [ 681.270952] CR2: ffffea0003480048 > > And a longer lockup that shows a few shmem_fallocate hanging, but they don't seem to be > the main reason for the hang (log it pretty long, attached). I wandered through the log you attached, but didn't have much idea of what I should look for, and set it aside to get on with other things. I think it was just confirming what Vlastimil indicated, that the one-liner patch is not good enough: lots of hole-punches waiting their turn for i_mutex, though I didn't see as many corresponding faults into those holes as perhaps I expected. Your mm/memory.c:1132 is more intriguing, but right now I can't afford to get very intrigued by it! Hugh -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html