Hi Dave,

> On Sun, 28 Dec 2014 22:51:27 +1100 Dave Chinner wrote:
> > On Wed, Dec 24, 2014 at 11:14:03AM +0100, Bruno Prémont wrote:
> > > On a server I've got the following traces, the first on Monday, the second
> > > one today. On Monday kernel was 3.14.17 and 3.14.27 for today (both captured
> > > via netconsole).
> > >
> > > Is that fixed in a newer kernel?
> > >
> > > I've xfs_repaired one of the two XFS partitions on the server though it
> > > found nothing to complain about. The other partition, containing /, has
> > > not been explicitly checked yet.
> > >
> > > If there is some information I should gather before xfs_repairing, please
> > > tell as soon as possible!
> > >
> > >
> > > Thanks,
> > > Bruno
> > >
> > > [6149136.014757] general protection fault: 0000 [#1] SMP
> > > [6149136.022825] Modules linked in: netconsole configfs
> > > [6149136.028996] CPU: 4 PID: 151 Comm: kworker/4:1H Not tainted 3.14.18-x86_64 #1
> > > [6149136.040750] Hardware name: HP ProLiant DL360 G6, BIOS P64 07/02/2013
> > > [6149136.048936] Workqueue: xfslogd xfs_buf_iodone_work
> > > [6149136.056836] task: ffff880212c67500 ti: ffff8800def3c000 task.ti: ffff8800def3c000
> > > [6149136.067023] RIP: 0010:[<ffffffff81255b67>] [<ffffffff81255b67>] xfs_trans_ail_delete_bulk+0x87/0x1a0
> > > [6149136.080940] RSP: 0018:ffff8800def3dce8 EFLAGS: 00010202
> > > [6149136.088889] RAX: dead000000100100 RBX: ffff88000211bd10 RCX: ffff88010e23fbb1
> > > [6149136.098962] RDX: 6b6b6b6b6b6b6b6b RSI: 6b6b6b6b6b6b6b6b RDI: ffff88000211bd10
> > > [6149136.110787] RBP: ffff8800def3dd38 R08: 6b6b6b6b6b6b6b6b R09: 2900000000000000
> >
> > You have memory poisoning turned on?
> >
> > #define POISON_FREE 0x6b /* for use-after-free poisoning */
>
> Yes, I do.
>
> > Did this occur at unmount? Can you reproduce it on a 3.18 kernel?
>
> No, it happens at runtime (apparently triggered/made likely by backup
> daemon reading through the filesystem, but not each time).
>
> Though that server is always busy writing to the disks (so backup
> makes it even more busy).
> It has two XFS partitions, one root partition including /var/
> and a second data partition, both being written to (the data partition
> more aggressively that the root one - root partition receives some
> deal of logging).

It happens rather often; yesterday it happened once again, still during
autonomous operation of the affected server.
This looks like it triggers more or less once every two weeks.

I'm going to switch to a more recent kernel (3.18.y) in the hope it has been
fixed there.

In case it is of some help, here is the objdumped xfs_trans_ail_delete_bulk:

0000000000000a70 <xfs_trans_ail_delete_bulk>:
 a70:   55                      push   %rbp
 a71:   48 8d 47 10             lea    0x10(%rdi),%rax
 a75:   48 89 e5                mov    %rsp,%rbp
 a78:   41 57                   push   %r15
 a7a:   41 56                   push   %r14
 a7c:   41 55                   push   %r13
 a7e:   41 54                   push   %r12
 a80:   45 31 e4                xor    %r12d,%r12d
 a83:   53                      push   %rbx
 a84:   48 89 fb                mov    %rdi,%rbx
 a87:   48 83 ec 18             sub    $0x18,%rsp
 a8b:   89 4d c4                mov    %ecx,-0x3c(%rbp)
 a8e:   48 89 c1                mov    %rax,%rcx
 a91:   48 89 45 c8             mov    %rax,-0x38(%rbp)
 a95:   48 8b 47 10             mov    0x10(%rdi),%rax
 a99:   48 39 c1                cmp    %rax,%rcx
 a9c:   4c 0f 45 e0             cmovne %rax,%r12
 aa0:   85 d2                   test   %edx,%edx
 aa2:   0f 8e 30 01 00 00       jle    bd8 <xfs_trans_ail_delete_bulk+0x168>
 aa8:   4c 8b 36                mov    (%rsi),%r14
 aab:   41 f6 46 34 01          testb  $0x1,0x34(%r14)
 ab0:   0f 84 ca 00 00 00       je     b80 <xfs_trans_ail_delete_bulk+0x110>
 ab6:   4c 8d 6e 08             lea    0x8(%rsi),%r13
 aba:   83 ea 01                sub    $0x1,%edx
 abd:   45 31 ff                xor    %r15d,%r15d
 ac0:   49 8d 44 d5 00          lea    0x0(%r13,%rdx,8),%rax
 ac5:   48 89 45 d0             mov    %rax,-0x30(%rbp)
 ac9:   eb 18                   jmp    ae3 <xfs_trans_ail_delete_bulk+0x73>
 acb:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
 ad0:   4d 8b 75 00             mov    0x0(%r13),%r14
 ad4:   49 83 c5 08             add    $0x8,%r13
 ad8:   41 f6 46 34 01          testb  $0x1,0x34(%r14)
 add:   0f 84 9d 00 00 00       je     b80 <xfs_trans_ail_delete_bulk+0x110>
 ae3:   48 b8 00 01 10 00 00    movabs $0xdead000000100100,%rax
 aea:   00 ad de
 aed:   49 8b 36                mov    (%r14),%rsi
 af0:   48 89 df                mov    %rbx,%rdi
 af3:   49 8b 56 08             mov    0x8(%r14),%rdx
 af7:   48 89 56 08             mov    %rdx,0x8(%rsi)
        ^^^^^^^^^^^
 afb:   48 89 32                mov    %rsi,(%rdx)
 afe:   4c 89 f6                mov    %r14,%rsi
 b01:   49 89 06                mov    %rax,(%r14)
 b04:   48 b8 00 02 20 00 00    movabs $0xdead000000200200,%rax
 b0b:   00 ad de
 b0e:   49 89 46 08             mov    %rax,0x8(%r14)
 b12:   e8 69 f5 ff ff          callq  80 <xfs_trans_ail_cursor_clear.constprop.9>
 b17:   b8 01 00 00 00          mov    $0x1,%eax
 b1c:   49 c7 46 10 00 00 00    movq   $0x0,0x10(%r14)
 b23:   00
 b24:   41 83 66 34 fe          andl   $0xfffffffe,0x34(%r14)
 b29:   4d 39 e6                cmp    %r12,%r14
 b2c:   44 0f 44 f8             cmove  %eax,%r15d
 b30:   4c 3b 6d d0             cmp    -0x30(%rbp),%r13
 b34:   75 9a                   jne    ad0 <xfs_trans_ail_delete_bulk+0x60>
 b36:   45 85 ff                test   %r15d,%r15d
 b39:   0f 84 99 00 00 00       je     bd8 <xfs_trans_ail_delete_bulk+0x168>
 b3f:   48 8b 3b                mov    (%rbx),%rdi
 b42:   f6 87 60 02 00 00 10    testb  $0x10,0x260(%rdi)
 b49:   0f 84 9c 00 00 00       je     beb <xfs_trans_ail_delete_bulk+0x17b>
 b4f:   48 8b 45 c8             mov    -0x38(%rbp),%rax
 b53:   48 3b 43 10             cmp    0x10(%rbx),%rax
 b57:   0f 84 98 00 00 00       je     bf5 <xfs_trans_ail_delete_bulk+0x185>
 b5d:   80 43 40 01             addb   $0x1,0x40(%rbx)
 b61:   48 8b 3b                mov    (%rbx),%rdi
 b64:   e8 00 00 00 00          callq  b69 <xfs_trans_ail_delete_bulk+0xf9>
 b69:   48 83 c4 18             add    $0x18,%rsp
 b6d:   5b                      pop    %rbx
 b6e:   41 5c                   pop    %r12
 b70:   41 5d                   pop    %r13
 b72:   41 5e                   pop    %r14
 b74:   41 5f                   pop    %r15
 b76:   5d                      pop    %rbp
 b77:   c3                      retq
 b78:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
 b7f:   00
 b80:   4c 8b 23                mov    (%rbx),%r12
 b83:   80 43 40 01             addb   $0x1,0x40(%rbx)
 b87:   41 f6 84 24 60 02 00    testb  $0x10,0x260(%r12)
 b8e:   00 10
 b90:   75 d7                   jne    b69 <xfs_trans_ail_delete_bulk+0xf9>
 b92:   4c 89 e7                mov    %r12,%rdi
 b95:   48 c7 c1 00 00 00 00    mov    $0x0,%rcx
 b9c:   be 04 00 00 00          mov    $0x4,%esi
 ba1:   48 c7 c2 00 00 00 00    mov    $0x0,%rdx
 ba8:   31 c0                   xor    %eax,%eax
 baa:   e8 00 00 00 00          callq  baf <xfs_trans_ail_delete_bulk+0x13f>
 baf:   8b 75 c4                mov    -0x3c(%rbp),%esi
 bb2:   4c 89 e7                mov    %r12,%rdi
 bb5:   b9 dc 02 00 00          mov    $0x2dc,%ecx
 bba:   48 c7 c2 00 00 00 00    mov    $0x0,%rdx
 bc1:   e8 00 00 00 00          callq  bc6 <xfs_trans_ail_delete_bulk+0x156>
 bc6:   48 83 c4 18             add    $0x18,%rsp
 bca:   5b                      pop    %rbx
 bcb:   41 5c                   pop    %r12
 bcd:   41 5d                   pop    %r13
 bcf:   41 5e                   pop    %r14
 bd1:   41 5f                   pop    %r15
 bd3:   5d                      pop    %rbp
 bd4:   c3                      retq
 bd5:   0f 1f 00                nopl   (%rax)
 bd8:   80 43 40 01             addb   $0x1,0x40(%rbx)
 bdc:   48 83 c4 18             add    $0x18,%rsp
 be0:   5b                      pop    %rbx
 be1:   41 5c                   pop    %r12
 be3:   41 5d                   pop    %r13
 be5:   41 5e                   pop    %r14
 be7:   41 5f                   pop    %r15
 be9:   5d                      pop    %rbp
 bea:   c3                      retq
 beb:   e8 00 00 00 00          callq  bf0 <xfs_trans_ail_delete_bulk+0x180>
 bf0:   e9 5a ff ff ff          jmpq   b4f <xfs_trans_ail_delete_bulk+0xdf>
 bf5:   48 8d 7b 68             lea    0x68(%rbx),%rdi


Thanks,
Bruno
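
For anyone matching the trace against the disassembly, here is a minimal
sketch of the arithmetic. It is not part of the original report. The
constants are copied from the oops and the objdump output; the helper names
are hypothetical, and the canonical-address check assumes 48-bit virtual
addresses (4-level paging). It shows that RIP offset +0x87 lands on the
instruction marked with carets (af7), and that the store through
0x8(%rsi) targets a non-canonical address, because RSI holds the
POISON_FREE pattern.

```python
# Values copied from the oops trace and the objdump listing.
FUNC_START = 0xA70          # object-file address of xfs_trans_ail_delete_bulk
RIP_OFFSET = 0x87           # from "xfs_trans_ail_delete_bulk+0x87/0x1a0"
POISON_FREE = 0x6B          # use-after-free poison byte quoted in the thread
RSI = 0x6B6B6B6B6B6B6B6B    # RSI at the time of the fault


def faulting_address(func_start: int, offset: int) -> int:
    """Object-file address of the instruction the oops RIP points at."""
    return func_start + offset


def is_poison(value: int, byte: int = POISON_FREE) -> bool:
    """True if every byte of a 64-bit value equals the poison byte."""
    return value.to_bytes(8, "little") == bytes([byte]) * 8


def is_canonical(addr: int) -> bool:
    """True if bits 63..47 are all equal (assumption: 48-bit virtual addresses)."""
    top = addr >> 47
    return top == 0 or top == 0x1FFFF


if __name__ == "__main__":
    print(hex(faulting_address(FUNC_START, RIP_OFFSET)))  # 0xaf7
    print(is_poison(RSI))                                 # True
    print(is_canonical(RSI + 8))                          # False
```

A store to a non-canonical address raises a general protection fault
rather than a page fault, which would be consistent with the "general
protection fault" line in the trace.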