Hi all,
I am testing NVMe over Fabrics on linux-4.17.0-rc4 (on CentOS Linux release 7.4) with Soft RoCE as the transport. I used nvme-cli to connect to the NVMe target over fabrics, and both connecting and listing the device succeeded:

./nvme connect -t rdma -n testsubsystem -a 15.15.15.2 -s 4420
./nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     be76ebbcf555a121     Linux                                    1         400.09 GB / 400.09 GB      512 B + 0 B      4.17.0-r

But when I use "fio" to do random writes to the NVMe device, I see a kernel warning, and after some time the target server becomes inaccessible:

fio --filename=/dev/nvme0n1 --ioengine=libaio --direct=1 --norandommap --randrepeat=0 --runtime=600 --blocksize=4K --rw=randwrite --iodepth=32 --numjobs=8 --group_reporting --name=myjob

-----------------------------------------------------------------------------
Jul 24 20:08:54 compute-559 kernel: CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.17.0-rc4 #1
Jul 24 20:08:54 compute-559 kernel: Hardware name: Dell Inc. PowerEdge T130/06FW8M, BIOS 2.1.4 04/13/2017
Jul 24 20:08:54 compute-559 kernel: RIP: 0010:__local_bh_enable_ip+0x35/0x60
Jul 24 20:08:54 compute-559 kernel: RSP: 0018:ffff9889afd43a78 EFLAGS: 00010006
Jul 24 20:08:54 compute-559 kernel: RAX: 0000000080010200 RBX: ffff98898e80aa08 RCX: 0000000000000000
Jul 24 20:08:54 compute-559 kernel: RDX: 000000000000003c RSI: 0000000000000200 RDI: ffffffffc015bbb2
Jul 24 20:08:54 compute-559 kernel: RBP: ffff98899b44fc1e R08: 0000000000000001 R09: ffff98899a892a00
Jul 24 20:08:54 compute-559 kernel: R10: ffff9889977163c0 R11: ffffffffc09d1300 R12: ffff98898e80aa78
Jul 24 20:08:54 compute-559 kernel: R13: ffffffffc0160618 R14: ffff9888fdfe1d00 R15: ffff98898f702000
Jul 24 20:08:54 compute-559 kernel: FS: 0000000000000000(0000) GS:ffff9889afd40000(0000) knlGS:0000000000000000
Jul 24 20:08:54 compute-559 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 24 20:08:54 compute-559 kernel: CR2: 00007f2c7b5565b0 CR3: 00000003be00a001 CR4: 00000000003606e0
Jul 24 20:08:54 compute-559 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jul 24 20:08:54 compute-559 kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Jul 24 20:08:54 compute-559 kernel: Call Trace:
Jul 24 20:08:54 compute-559 kernel: <IRQ>
Jul 24 20:08:54 compute-559 kernel: ipt_do_table+0x34e/0x650 [ip_tables]
Jul 24 20:08:54 compute-559 kernel: ? unwind_get_return_address+0x1c/0x30
Jul 24 20:08:54 compute-559 kernel: ? __save_stack_trace+0x75/0x100
Jul 24 20:08:54 compute-559 kernel: ? nf_ct_get_tuple+0x61/0xa0 [nf_conntrack]
Jul 24 20:08:54 compute-559 kernel: ? udp_packet+0x79/0x80 [nf_conntrack]
Jul 24 20:08:54 compute-559 kernel: ? nf_conntrack_in+0x1ba/0x540 [nf_conntrack]
Jul 24 20:08:54 compute-559 kernel: iptable_mangle_hook+0x7d/0xf0 [iptable_mangle]
Jul 24 20:08:54 compute-559 kernel: nf_hook_slow+0x3d/0xb0
Jul 24 20:08:54 compute-559 kernel: __ip_local_out+0xf6/0x120
Jul 24 20:08:54 compute-559 kernel: ? neigh_key_eq32+0x10/0x10
Jul 24 20:08:54 compute-559 kernel: ip_local_out+0x17/0x40
Jul 24 20:08:54 compute-559 kernel: rxe_send+0x9a/0x110 [rdma_rxe]
Jul 24 20:08:54 compute-559 kernel: rxe_requester+0x97e/0x11f0 [rdma_rxe]
Jul 24 20:08:54 compute-559 kernel: rxe_do_task+0x8b/0x100 [rdma_rxe]
Jul 24 20:08:54 compute-559 kernel: rxe_post_send+0x3f4/0x550 [rdma_rxe]
Jul 24 20:08:54 compute-559 kernel: nvmet_rdma_queue_response+0xeb/0x1a0 [nvmet_rdma]
Jul 24 20:08:54 compute-559 kernel: ? i40e_clean_rx_irq+0x3b5/0xcf0 [i40e]
Jul 24 20:08:54 compute-559 kernel: nvmet_req_complete+0x11/0x40 [nvmet]
Jul 24 20:08:54 compute-559 kernel: nvmet_bio_done+0x2b/0x40 [nvmet]
Jul 24 20:08:54 compute-559 kernel: blk_update_request+0x95/0x2f0
Jul 24 20:08:54 compute-559 kernel: blk_mq_end_request+0x1a/0xc0
Jul 24 20:08:54 compute-559 kernel: blk_mq_complete_request+0xa1/0x110
Jul 24 20:08:54 compute-559 kernel: nvme_irq+0x12f/0x1e0 [nvme]
Jul 24 20:08:54 compute-559 kernel: __handle_irq_event_percpu+0x40/0x1a0
Jul 24 20:08:54 compute-559 kernel: handle_irq_event_percpu+0x30/0x70
Jul 24 20:08:54 compute-559 kernel: handle_irq_event+0x36/0x60
Jul 24 20:08:54 compute-559 kernel: handle_edge_irq+0x90/0x190
Jul 24 20:08:54 compute-559 kernel: handle_irq+0xb1/0x130
Jul 24 20:08:54 compute-559 kernel: ? tick_irq_enter+0x9c/0xb0
Jul 24 20:08:54 compute-559 kernel: do_IRQ+0x43/0xd0
Jul 24 20:08:54 compute-559 kernel: common_interrupt+0xf/0xf

Could anyone suggest a fix or workaround? Thanks for the support.

Regards,
Pradeep.