On Mon, Oct 30, 2017 at 6:19 PM, Cong Wang <xiyou.wangcong@xxxxxxxxx> wrote:
>
> 1. The faulty addresses are all near 0000000100000000, with one exception
>    of null (which is the most recent one)

Well, they're at 8(%rax), except for that last case.

And in every case (_including_ that last case), %rax has a very
interesting pattern. That's the (bad) buf->ops pointer that was loaded
from the somehow corrupted "buf".

The values in all cases are

    00000000fffffffa
    00000000fffffffd
    00000000fffffff1
    00000000fffffff7
    00000000fffffff4
    00000000fffffffa
    00000000fffffffd
    00000000fffffffd
    00000000fffffffa
    00000000ffffffe8
    00000000fffffff1
    00000000fffffff7

which kind of looks like a 32-bit error value. So we have (n, val, (errno)):

    1  -24 (EMFILE)
    2  -15 (ENOTBLK)
    1  -12 (ENOMEM)
    2   -9 (EBADF)
    3   -6 (ENXIO)
    3   -3 (ESRCH)

none of which makes any sense to me, but it's an interesting pattern
nonetheless.

> 2. R12 register, which should map to the local variable 'i', is always 0x8
>    at the time of crash.

So _if_ this is some kind of use-after-free thing, and the allocation
got re-used for something else, that might just be related to whatever
ends up being the offset that is filled in with the (int) error
number.

Except the offset is that %r12*0x28+0x10, so we're talking a byte
offset of 336 bytes (0x150) into the allocation, and apparently the
eight previous (0-7) iterations were fine.

Which is really odd. I'm not seeing anything that makes sense. I'll
have to think about this.

I'm assuming you don't have slub debugging enabled, and no way to
enable it and try to catch this?

               Linus
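
The arithmetic in the message above can be rechecked with a short script. This is only a sketch: the %rax values are copied from the list in the mail, while the names `RAX_VALUES`, `as_s32`, `ELEM_SIZE` and `FIELD_OFFSET` are made up for illustration, and the 0x28 / 0x10 constants are read straight off the `%r12*0x28+0x10` addressing expression rather than from any kernel header. The errno names printed depend on the platform's numbering (the ones in the mail are the Linux ones).

```python
import errno
from collections import Counter

# %rax values as listed in the mail (the bad buf->ops pointers).
RAX_VALUES = [
    0x00000000fffffffa, 0x00000000fffffffd, 0x00000000fffffff1,
    0x00000000fffffff7, 0x00000000fffffff4, 0x00000000fffffffa,
    0x00000000fffffffd, 0x00000000fffffffd, 0x00000000fffffffa,
    0x00000000ffffffe8, 0x00000000fffffff1, 0x00000000fffffff7,
]


def as_s32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    low = value & 0xFFFFFFFF
    return low - (1 << 32) if low & 0x80000000 else low


# Tally how often each decoded value shows up.
counts = Counter(as_s32(v) for v in RAX_VALUES)
for val in sorted(counts):
    # errno.errorcode maps positive numbers to names on this platform.
    name = errno.errorcode.get(-val, "?")
    print(f"{counts[val]}  {val:4d} ({name})")

# The faulting access is 8(%rax): for the first value this lands just
# above 4 GiB, matching the "near 0000000100000000" observation.
fault_addr = RAX_VALUES[0] + 8
print(f"fault address: {fault_addr:#018x}")

# Byte offset implied by %r12*0x28+0x10 with %r12 == 8.
ELEM_SIZE = 0x28     # assumed per-element stride, from the scale factor
FIELD_OFFSET = 0x10  # assumed offset of the loaded field in an element
R12 = 0x8
byte_offset = R12 * ELEM_SIZE + FIELD_OFFSET
print(f"byte offset: {byte_offset} ({byte_offset:#x})")
```

Every one of the twelve values decodes to a small negative number that is a multiple of 3, which is what makes the set look like error codes rather than random corruption.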