On Sat, Oct 29, 2022 12:10 PM Bob Pearson wrote:
<...>
> Bob Pearson (13):
>   RDMA/rxe: Make task interface pluggable
>   RDMA/rxe: Split rxe_drain_resp_pkts()
>   RDMA/rxe: Simplify reset state handling in rxe_resp.c
>   RDMA/rxe: Handle qp error in rxe_resp.c
>   RDMA/rxe: Cleanup comp tasks in rxe_qp.c
>   RDMA/rxe: Remove __rxe_do_task()
>   RDMA/rxe: Make tasks schedule each other
>   RDMA/rxe: Implement disable/enable_task()
>   RDMA/rxe: Replace TASK_STATE_START by TASK_STATE_IDLE
>   RDMA/rxe: Replace task->destroyed by task state INVALID.
>   RDMA/rxe: Add workqueue support for tasks
>   RDMA/rxe: Make WORKQUEUE default for RC tasks
>   RDMA/rxe: Remove tasklets from rxe_task.c

Hello Bob,

I have found a soft lockup issue reproducible with the rdma-core testcases. It does not happen on the latest for-next tree, but it does with this patch series applied, so we need to fix the issue before it gets merged.

I did the test on a VM with 8 CPUs. I fetched the latest rdma-core, built it, and executed the following command inside the 'rdma-core' directory:
 # while true; do ./build/bin/run_tests.py -v --dev rxe_ens6 --gid 1; sleep 2; done
(Please specify your 'dev' and 'gid' appropriately.)

Before 10 minutes had passed, my console froze and became unresponsive, showing the test progress below:
=====
test_create_ah (tests.test_addr.AHTest)
Test ibv_create_ah. ... ok
test_create_ah_roce (tests.test_addr.AHTest)
Verify that AH can't be created without GRH in RoCE ... ok
test_destroy_ah (tests.test_addr.AHTest)
Test ibv_destroy_ah. ... ok
test_atomic_cmp_and_swap (tests.test_atomic.AtomicTest) ... ok
test_atomic_fetch_and_add (tests.test_atomic.AtomicTest) ... ok
test_atomic_invalid_lkey (tests.test_atomic.AtomicTest) ...
=====
Note that this does not always happen; you may have to wait for several minutes.

Here is the backtrace:
=====
[ 1212.135650] watchdog: BUG: soft lockup - CPU#3 stuck for 1017s! [python3:3428]
[ 1212.138144] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rpcrdma rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm rdma_rxe ib_uverbs ip6_udp_tunnel udp_tunnel ib_core rfkill sunrpc intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass joydev nd_pmem virtio_balloon dax_pmem nd_btt i2c_piix4 pcspkr drm xfs libcrc32c sd_mod t10_pi sr_mod crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg nd_e820 ata_generic libnvdimm crct10dif_pclmul crc32_pclmul crc32c_intel ata_piix virtio_net libata ghash_clmulni_intel e1000 net_failover sha512_ssse3 failover virtio_console serio_raw dm_mirror dm_region_hash dm_log dm_mod fuse
[ 1212.147152] CPU: 3 PID: 3428 Comm: python3 Kdump: loaded Tainted: G L 6.1.0-rc1+ #29
[ 1212.148425] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1212.149754] RIP: 0010:__local_bh_enable_ip+0x26/0x70
[ 1212.150464] Code: 00 00 66 90 0f 1f 44 00 00 65 8b 05 c4 7e d2 4e a9 00 00 0f 00 75 31 83 ee 01 f7 de 65 01 35 b1 7e d2 4e 65 8b 05 aa 7e d2 4e <a9> 00 ff ff 00 74 1b 65 ff 0d 9c 7e d2 4e 65 8b 05 95 7e d2 4e 85
[ 1212.153081] RSP: 0018:ffffaf0b8054baf8 EFLAGS: 00000203
[ 1212.153822] RAX: 0000000000000001 RBX: ffff8d2189171450 RCX: 00000000ffffffff
[ 1212.154823] RDX: 0000000000000001 RSI: 00000000fffffe00 RDI: ffffffffc0b51baa
[ 1212.155826] RBP: ffff8d2189171474 R08: 0000011a38ad4004 R09: 0000000000000101
[ 1212.156862] R10: ffffffffb2c06100 R11: 0000000000000000 R12: 0000000000000000
[ 1212.157860] R13: ffff8d218df5eda0 R14: ffff8d2181058328 R15: 0000000000000000
[ 1212.158858] FS:  00007f8cec55f740(0000) GS:ffff8d23b5cc0000(0000) knlGS:0000000000000000
[ 1212.159989] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1212.160837] CR2: 00007f8ceaa15024 CR3: 0000000108c2c002 CR4: 0000000000060ee0
[ 1212.161841] Call Trace:
[ 1212.162205]  <TASK>
[ 1212.162522]  work_cleanup+0x3a/0x40 [rdma_rxe]
[ 1212.163172]  rxe_qp_do_cleanup+0x54/0x1e0 [rdma_rxe]
[ 1212.163888]  execute_in_process_context+0x23/0x70
[ 1212.164575]  __rxe_cleanup+0xc6/0x170 [rdma_rxe]
[ 1212.165245]  rxe_destroy_qp+0x28/0x40 [rdma_rxe]
[ 1212.165909]  ib_destroy_qp_user+0x90/0x1b0 [ib_core]
[ 1212.166646]  uverbs_free_qp+0x35/0x90 [ib_uverbs]
[ 1212.167333]  destroy_hw_idr_uobject+0x1e/0x50 [ib_uverbs]
[ 1212.168103]  uverbs_destroy_uobject+0x37/0x1c0 [ib_uverbs]
[ 1212.168899]  uobj_destroy+0x3c/0x80 [ib_uverbs]
[ 1212.169532]  ib_uverbs_run_method+0x203/0x320 [ib_uverbs]
[ 1212.170412]  ? uverbs_free_qp+0x90/0x90 [ib_uverbs]
[ 1212.171151]  ib_uverbs_cmd_verbs+0x172/0x220 [ib_uverbs]
[ 1212.171912]  ? free_unref_page_commit+0x7e/0x170
[ 1212.172583]  ? xa_destroy+0x82/0x110
[ 1212.173104]  ? kvfree_call_rcu+0x27d/0x310
[ 1212.173692]  ? ioctl_has_perm.constprop.0.isra.0+0xbd/0x120
[ 1212.174481]  ib_uverbs_ioctl+0xa4/0x110 [ib_uverbs]
[ 1212.175182]  __x64_sys_ioctl+0x8a/0xc0
[ 1212.175725]  do_syscall_64+0x3b/0x90
[ 1212.176243]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 1212.176973] RIP: 0033:0x7f8cebe3ec6b
[ 1212.177489] Code: 73 01 c3 48 8b 0d b5 b1 1b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 85 b1 1b 00 f7 d8 64 89 01 48
[ 1212.180072] RSP: 002b:00007ffee64927a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 1212.181150] RAX: ffffffffffffffda RBX: 00007ffee64928d8 RCX: 00007f8cebe3ec6b
[ 1212.182131] RDX: 00007ffee64928c0 RSI: 00000000c0181b01 RDI: 0000000000000003
[ 1212.183099] RBP: 00007ffee64928a0 R08: 000056424f8c0c70 R09: 000000000002ae30
[ 1212.184064] R10: 00007f8cead75e10 R11: 0000000000000246 R12: 00007ffee649287c
[ 1212.185061] R13: 0000000000000022 R14: 000056424f8c0560 R15: 00007f8cebb903d0
[ 1212.186031]  </TASK>

Message from syslogd@c9ibremote at Nov 17 17:03:12 ...
 kernel:watchdog: BUG: soft lockup - CPU#3 stuck for 1017s! [python3:3428]
=====

It seems that ibv_destroy_qp(3) was issued from userspace, right? I am not very familiar with uverbs ;(

The process got stuck at the do-while loop below:
=====
rxe_task.c
---
/* busy wait until any previous tasks are done */
static void cleanup_task(struct rxe_task *task)
{
	bool busy;

	do {
		spin_lock_bh(&task->lock);
		busy = (task->state == TASK_STATE_BUSY ||
			task->state == TASK_STATE_ARMED);
		if (!busy)
			task->state = TASK_STATE_INVALID;
		spin_unlock_bh(&task->lock);
	} while (busy);
}
=====
Typically task->state is 0 (i.e. "TASK_STATE_IDLE") when we reach here, but in the infinite loop, task->state was constantly 1 (i.e. "TASK_STATE_BUSY").

IMO, the bottom halves completed their work but left task->state as "TASK_STATE_BUSY", and the ibv_destroy_qp(3) issued from userspace thereafter got stuck. However, I am not sure how this counter-intuitive state transition occurs. Do you have any idea why this happens and how to fix it?

I cannot take enough time to inspect this issue further right now, but I will update you if I find anything helpful to fix this.
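For reference, something along the lines of the untested sketch below (the helper name and the 5-second limit are arbitrary; it only reuses the task states and the spinlock from this series) would warn and bail out instead of spinning forever, which might make the stuck state easier to capture. It is just a debugging aid, not a fix for the state transition itself:
=====
/* Untested sketch: same busy wait as cleanup_task(), but give up
 * after an arbitrary limit and report the stuck task state instead
 * of spinning forever.
 */
static void cleanup_task_bounded(struct rxe_task *task)
{
	unsigned long timeout = jiffies + 5 * HZ;
	bool busy;
	int state;

	do {
		spin_lock_bh(&task->lock);
		state = task->state;
		busy = (state == TASK_STATE_BUSY ||
			state == TASK_STATE_ARMED);
		if (!busy)
			task->state = TASK_STATE_INVALID;
		spin_unlock_bh(&task->lock);

		if (busy && time_after(jiffies, timeout)) {
			WARN_ONCE(1, "rxe task stuck in state %d\n", state);
			break;
		}
		cpu_relax();
	} while (busy);
}
=====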
Thanks,
Daisuke

>  drivers/infiniband/sw/rxe/rxe.c      |   9 +-
>  drivers/infiniband/sw/rxe/rxe_comp.c |  24 ++-
>  drivers/infiniband/sw/rxe/rxe_qp.c   |  80 ++++-----
>  drivers/infiniband/sw/rxe/rxe_req.c  |   4 +-
>  drivers/infiniband/sw/rxe/rxe_resp.c |  70 +++++---
>  drivers/infiniband/sw/rxe/rxe_task.c | 258 +++++++++++++++++++--------
>  drivers/infiniband/sw/rxe/rxe_task.h |  56 +++---
>  7 files changed, 329 insertions(+), 172 deletions(-)
>
>
> base-commit: 692373d186205dfb1b56f35f22702412d94d9420
> --
> 2.34.1