Re: [EXPERIMENTAL v1 0/4] RDMA loopback device

On Sun, Mar 10, 2019 at 07:37:56PM +0000, Parav Pandit wrote:
> 
> 
> > -----Original Message-----
> > From: Parav Pandit
> > Sent: Sunday, March 10, 2019 2:24 PM
> > To: 'Yuval Shaia' <yuval.shaia@xxxxxxxxxx>
> > Cc: Bart Van Assche <bvanassche@xxxxxxx>; Ira Weiny
> > <ira.weiny@xxxxxxxxx>; Leon Romanovsky <leon@xxxxxxxxxx>; Dennis
> > Dalessandro <dennis.dalessandro@xxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx;
> > Marcel Apfelbaum <marcel.apfelbaum@xxxxxxxxx>; Kamal Heib
> > <kheib@xxxxxxxxxx>
> > Subject: RE: [EXPERIMENTAL v1 0/4] RDMA loopback device
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Yuval Shaia <yuval.shaia@xxxxxxxxxx>
> > > Sent: Sunday, March 10, 2019 2:23 PM
> > > To: Parav Pandit <parav@xxxxxxxxxxxx>
> > > Cc: Bart Van Assche <bvanassche@xxxxxxx>; Ira Weiny
> > > <ira.weiny@xxxxxxxxx>; Leon Romanovsky <leon@xxxxxxxxxx>; Dennis
> > > Dalessandro <dennis.dalessandro@xxxxxxxxx>;
> > > linux-rdma@xxxxxxxxxxxxxxx; Marcel Apfelbaum
> > > <marcel.apfelbaum@xxxxxxxxx>; Kamal Heib <kheib@xxxxxxxxxx>
> > > Subject: Re: [EXPERIMENTAL v1 0/4] RDMA loopback device
> > >
> > > > (hint, as a starting point please provide a fix to avoid crash in
> > > > memory registration in rxe :-) ).
> > >
> > > I'm not aware of a crash in memory registration, can you describe the
> > > steps to reproduce?
> > >
> > ib_send_bw -x 1 -d rxe0 -a
> > ib_send_bw -x 1 -d rxe0 -a <ip_address>
> I did a quick run just now on 5.0.0-rc7 and it is not crashing; it used to crash for me on 5.0.0-rc5.
> Seems better now.
> 
> It's running at 1.6Gbps compared to loopback at 50Gbps, but hey, we can ignore the 50x performance difference. :-)

No, we can't ignore it - this is a huge motivation to enhance RXE with memcpy!!

> 
> With write bw I hit a soft lockup:
> kernel:watchdog: BUG: soft lockup - CPU#63 stuck for 22s! [ksoftirqd/63:328]
> 
> kernel: irq event stamp: 354570533
> kernel: hardirqs last  enabled at (354570532): [<ffffffff92c23f12>] _raw_read_unlock_irqrestore+0x32/0x60
> kernel: hardirqs last disabled at (354570533): [<ffffffff92403717>] trace_hardirqs_off_thunk+0x1a/0x1c
> kernel: softirqs last  enabled at (20353810): [<ffffffff93000325>] __do_softirq+0x325/0x3cf
> kernel: softirqs last disabled at (20353815): [<ffffffff9249eea5>] run_ksoftirqd+0x35/0x50
> kernel: CPU: 32 PID: 173 Comm: ksoftirqd/32 Kdump: loaded Tainted: G             L    5.0.0-rc7-vdevbus+ #2
> kernel: Hardware name: Supermicro SYS-6028U-TR4+/X10DRU-i+, BIOS 2.0b 08/09/2016
> kernel: rxe_responder+0x941/0x1ff0 [rdma_rxe]
> kernel: ? __lock_acquire+0x240/0xf60
> kernel: ? find_held_lock+0x31/0xa0
> kernel: ? find_held_lock+0x31/0xa0
> kernel: ? rxe_do_task+0x7e/0xf0 [rdma_rxe]
> kernel: ? _raw_spin_unlock_irqrestore+0x32/0x51
> kernel: rxe_do_task+0x85/0xf0 [rdma_rxe]
> kernel: rxe_rcv+0x346/0x840 [rdma_rxe]
> kernel: ? copy_data+0x113/0x240 [rdma_rxe]
> kernel: rxe_requester+0x7c8/0x1060 [rdma_rxe]
> kernel: rxe_do_task+0x85/0xf0 [rdma_rxe]
> kernel: tasklet_action_common.isra.19+0x187/0x1a0
> kernel: __do_softirq+0xd0/0x3cf
> kernel: run_ksoftirqd+0x35/0x50
> kernel: smpboot_thread_fn+0xfe/0x150
> kernel: kthread+0xf5/0x130
> kernel: ? sort_range+0x20/0x20
> kernel: ? kthread_bind+0x10/0x10
> kernel: ret_from_fork+0x24/0x30
> kernel: rcu: INFO: rcu_sched self-detected stall on CPU
> kernel: rcu: 	32-....: (64452 ticks this GP) idle=586/1/0x4000000000000002 softirq=184257/184259 fqs=16251
> kernel: rcu: 	(t=65008 jiffies g=8870789 q=3260)
> kernel: NMI backtrace for cpu 32
> kernel: CPU: 32 PID: 173 Comm: ksoftirqd/32 Kdump: loaded Tainted: G             L    5.0.0-rc7-vdevbus+ #2

Is this dump from rc5, or is it still happening with rc7?

> 


