Re: slab leak on rxe

This is the first time I have hit this bug, and I have not found what triggers it yet.

In some situations we kill the process with kill -9. Could that cause it?

Before this happened, the following errors were reported:

Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5
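
In case it helps with triage, here is a minimal sketch for counting these warnings per QPN from a saved kernel log. It assumes the message format is exactly as shown above; the function name count_unmatched_qpns and the sample list are only illustrative and not part of rxe or any existing tool.

```python
import re
from collections import Counter

# Matches the rxe warning shown above; the QPN is captured as a hex string.
QPN_RE = re.compile(r"rdma_rxe: no qp matches qpn (0x[0-9a-fA-F]+)")


def count_unmatched_qpns(lines):
    """Return a Counter mapping each QPN (as int) to its warning count."""
    counts = Counter()
    for line in lines:
        m = QPN_RE.search(line)
        if m:
            counts[int(m.group(1), 16)] += 1
    return counts


# Illustrative input: the ten identical lines from the log above.
sample = ["Feb 11 04:24:55  kernel: rdma_rxe: no qp matches qpn 0x31f5"] * 10

for qpn, n in count_unmatched_qpns(sample).items():
    print(f"qpn {qpn:#x}: {n} warnings")
```

On a real log, the same function could be fed the lines of a file opened for reading, which would show whether only this one QPN is affected or several are.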

On Tue, Feb 11, 2020 at 3:42 PM Zhu Yanjun <zyjzyj2000@xxxxxxxxx> wrote:
>
> Can this bug be reproduced?
>
> Zhu Yanjun
>
> On Tue, Feb 11, 2020 at 3:32 PM Frank Huang <tigerinxm@xxxxxxxxx> wrote:
> >
> > Re-posting the log; sorry about the formatting.
> >
> > Feb 11 14:17:31  kernel:
> > =============================================================================
> > Feb 11 14:17:31  kernel: BUG rxe-qp (Tainted: G           OE  ):
> > Objects remaining in rxe-qp on __kmem_cache_shutdown()
> > Feb 11 14:17:31  kernel:
> > -----------------------------------------------------------------------------
> > Feb 11 14:17:31  kernel: Disabling lock debugging due to kernel taint
> > Feb 11 14:17:31  kernel: INFO: Slab 0xfffff4c4b027a000 objects=16
> > used=1 fp=0xffff96f3c9e83f00 flags=0x17ffffc0008100
> > Feb 11 14:17:31  kernel: CPU: 27 PID: 25588 Comm: rmmod Tainted: G
> > B      OE   4.14.97-.el7.centos.x86_64 #1
> > Feb 11 14:17:31  kernel: Hardware name: /80010056, BIOS 4.1.15 03/28/2017
> > Feb 11 14:17:31  kernel: Call Trace:
> > Feb 11 14:17:31  kernel:  dump_stack+0x5a/0x7b
> > Feb 11 14:17:31  kernel:  slab_err+0xb4/0xe0
> > Feb 11 14:17:31  kernel:  ? calibrate_delay+0x138/0x5f0
> > Feb 11 14:17:31  kernel:  ? on_each_cpu_mask+0x27/0x60
> > Feb 11 14:17:31  kernel:  ? on_each_cpu_cond+0xaf/0x140
> > Feb 11 14:17:31  kernel:  ? __kmalloc+0x179/0x200
> > Feb 11 14:17:31  kernel:  ? __kmem_cache_shutdown+0x194/0x3d0
> > Feb 11 14:17:31  kernel:  __kmem_cache_shutdown+0x1b4/0x3d0
> > Feb 11 14:17:31  kernel:  shutdown_cache+0x13/0x1b0
> > Feb 11 14:17:31  kernel:  kmem_cache_destroy+0x1e4/0x220
> > Feb 11 14:17:31  kernel:  rxe_cache_clean+0x41/0x60 [rdma_rxe]
> > Feb 11 14:17:31  kernel:  rxe_module_exit+0xf/0x68 [rdma_rxe]
> > Feb 11 14:17:31  kernel:  SyS_delete_module+0x175/0x270
> > Feb 11 14:17:31  kernel:  do_syscall_64+0x74/0x190
> > Feb 11 14:17:31  kernel:  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> > Feb 11 14:17:31  kernel: RIP: 0033:0x7ff146d3f517
> > Feb 11 14:17:31  kernel: RSP: 002b:00007ffd4b5c1598 EFLAGS: 00000202
> > ORIG_RAX: 00000000000000b0
> > Feb 11 14:17:31  kernel: RAX: ffffffffffffffda RBX: 0000000000d78280
> > RCX: 00007ff146d3f517
> > Feb 11 14:17:31  kernel: RDX: 00007ff146db3ca0 RSI: 0000000000000800
> > RDI: 0000000000d782e8
> > Feb 11 14:17:31  kernel: RBP: 0000000000000000 R08: 00007ff147008060
> > R09: 00007ff146db3ca0
> > Feb 11 14:17:31  kernel: R10: 00007ffd4b5c1020 R11: 0000000000000202
> > R12: 00007ffd4b5c36ca
> > Feb 11 14:17:31  kernel: R13: 0000000000000000 R14: 0000000000d78280
> > R15: 0000000000d78010
> > Feb 11 14:17:31  kernel: INFO: Object 0xffff96f3c9e84ec0 @offset=20160
> > Feb 11 14:17:31  kernel: kmem_cache_destroy rxe-qp: Slab cache still has objects
> > Feb 11 14:17:31  kernel: CPU: 27 PID: 25588 Comm: rmmod Tainted: G
> > B      OE   4.14.97-.el7.centos.x86_64 #1
> > Feb 11 14:17:31  kernel: Hardware name: /80010056, BIOS 4.1.15 03/28/2017
> > Feb 11 14:17:31  kernel: Call Trace:
> > Feb 11 14:17:31  kernel:  dump_stack+0x5a/0x7b
> > Feb 11 14:17:31  kernel:  kmem_cache_destroy+0x203/0x220
> > Feb 11 14:17:31  kernel:  rxe_cache_clean+0x41/0x60 [rdma_rxe]
> > Feb 11 14:17:31  kernel:  rxe_module_exit+0xf/0x68 [rdma_rxe]
> > Feb 11 14:17:31  kernel:  SyS_delete_module+0x175/0x270
> > Feb 11 14:17:31  kernel:  do_syscall_64+0x74/0x190
> > Feb 11 14:17:31  kernel:  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> > Feb 11 14:17:31  kernel: RIP: 0033:0x7ff146d3f517
> > Feb 11 14:17:31  kernel: RSP: 002b:00007ffd4b5c1598 EFLAGS: 00000202
> > ORIG_RAX: 00000000000000b0
> > Feb 11 14:17:31  kernel: RAX: ffffffffffffffda RBX: 0000000000d78280
> > RCX: 00007ff146d3f517
> > Feb 11 14:17:31  kernel: RDX: 00007ff146db3ca0 RSI: 0000000000000800
> > RDI: 0000000000d782e8
> >
> > On Tue, Feb 11, 2020 at 3:09 PM Frank Huang <tigerinxm@xxxxxxxxx> wrote:
> > >
> > > Hi, All
> > >
> > > When I use an old version of rdma_rxe (kernel 4.14.97), there is a
> > > slab leak of a qp. Is it fixed in the newest version? I searched the
> > > commit history on kernel.org but did not find the same issue.
> > >
> > >
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > =============================================================================
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: BUG
> > > rxe-qp (Tainted: G           OE  ): Objects remaining in rxe-qp on
> > > __kmem_cache_shutdown()
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > -----------------------------------------------------------------------------
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: Disabling
> > > lock debugging due to kernel taint
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: INFO:
> > > Slab 0xfffff4c4b027a000 objects=16 used=1 fp=0xffff96f3c9e83f00
> > > flags=0x17ffffc0008100
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: CPU: 27
> > > PID: 25588 Comm: rmmod Tainted: G    B      OE
> > > 4.14.97-.el7.centos.x86_64 #1
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: Hardware
> > > name: 80010056, BIOS 4.1.15 03/28/2017
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: Call Trace:
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > dump_stack+0x5a/0x7b
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:  slab_err+0xb4/0xe0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:  ?
> > > calibrate_delay+0x138/0x5f0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:  ?
> > > on_each_cpu_mask+0x27/0x60
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:  ?
> > > on_each_cpu_cond+0xaf/0x140
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:  ?
> > > __kmalloc+0x179/0x200
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:  ?
> > > __kmem_cache_shutdown+0x194/0x3d0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > __kmem_cache_shutdown+0x1b4/0x3d0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > shutdown_cache+0x13/0x1b0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > kmem_cache_destroy+0x1e4/0x220
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > rxe_cache_clean+0x41/0x60 [rdma_rxe]
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > rxe_module_exit+0xf/0x68 [rdma_rxe]
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > SyS_delete_module+0x175/0x270
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > do_syscall_64+0x74/0x190
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RIP:
> > > 0033:0x7ff146d3f517
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RSP:
> > > 002b:00007ffd4b5c1598 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RAX:
> > > ffffffffffffffda RBX: 0000000000d78280 RCX: 00007ff146d3f517
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RDX:
> > > 00007ff146db3ca0 RSI: 0000000000000800 RDI: 0000000000d782e8
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RBP:
> > > 0000000000000000 R08: 00007ff147008060 R09: 00007ff146db3ca0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: R10:
> > > 00007ffd4b5c1020 R11: 0000000000000202 R12: 00007ffd4b5c36ca
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: R13:
> > > 0000000000000000 R14: 0000000000d78280 R15: 0000000000d78010
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: INFO:
> > > Object 0xffff96f3c9e84ec0 @offset=20160
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > kmem_cache_destroy rxe-qp: Slab cache still has objects
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: CPU: 27
> > > PID: 25588 Comm: rmmod Tainted: G    B      OE
> > > 4.14.97-.el7.centos.x86_64 #1
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: Hardware
> > > name: 80010056, BIOS 4.1.15 03/28/2017
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: Call Trace:
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > dump_stack+0x5a/0x7b
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > kmem_cache_destroy+0x203/0x220
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > rxe_cache_clean+0x41/0x60 [rdma_rxe]
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > rxe_module_exit+0xf/0x68 [rdma_rxe]
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > SyS_delete_module+0x175/0x270
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > do_syscall_64+0x74/0x190
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel:
> > > entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RIP:
> > > 0033:0x7ff146d3f517
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RSP:
> > > 002b:00007ffd4b5c1598 EFLAGS: 00000202 ORIG_RAX: 00000000000000b0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RAX:
> > > ffffffffffffffda RBX: 0000000000d78280 RCX: 00007ff146d3f517
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RDX:
> > > 00007ff146db3ca0 RSI: 0000000000000800 RDI: 0000000000d782e8
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: RBP:
> > > 0000000000000000 R08: 00007ff147008060 R09: 00007ff146db3ca0
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: R10:
> > > 00007ffd4b5c1020 R11: 0000000000000202 R12: 00007ffd4b5c36ca
> > > Feb 11 14:17:31 57c4c63f-e817-4e22-aec9-72bc376b757c kernel: R13:
> > > 0000000000000000 R14: 0000000000d78280 R15: 0000000000d78010


