Re: kernel null pointer at nvme_tcp_init_iter[nvme_tcp] with blktests nvme-tcp/012

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 2/7/21 1:25 PM, Chaitanya Kulkarni wrote:
Yi,

On 2/6/21 20:51, Yi Zhang wrote:
The issue was introduced after merge the NVMe updates

commit 0fd6456fd1f4c8f3ec5a2df6ed7f34458a180409 (HEAD)
Merge: 44d10e4b2f2c 0d7389718c32
Author: Jens Axboe <axboe@xxxxxxxxx>
Date:   Tue Feb 2 07:12:06 2021 -0700

      Merge branch 'for-5.12/drivers' into for-next

      * for-5.12/drivers: (22 commits)
        nvme-tcp: use cancel tagset helper for tear down
        nvme-rdma: use cancel tagset helper for tear down
        nvme-tcp: add clean action for failed reconnection
        nvme-rdma: add clean action for failed reconnection
        nvme-core: add cancel tagset helpers
        nvme-core: get rid of the extra space
        nvme: add tracing of zns commands
        nvme: parse format nvm command details when tracing
        nvme: update enumerations for status codes
        nvmet: add lba to sect conversion helpers
        nvmet: remove extra variable in identify ns
        nvmet: remove extra variable in id-desclist
        nvmet: remove extra variable in smart log nsid
        nvme: refactor ns->ctrl by request
        nvme-tcp: pass multipage bvec to request iov_iter
        nvme-tcp: get rid of unused helper function
        nvme-tcp: fix wrong setting of request iov_iter
        nvme: support command retry delay for admin command
        nvme: constify static attribute_group structs
        nvmet-fc: use RCU proctection for assoc_list


On 2/6/21 11:08 AM, Yi Zhang wrote:
blktests nvme-tcp/012
Thanks for reporting, you can further bisect this using following
NVMe tree if is from NVMEe tree :- http://git.infradead.org/nvme.git
Branch :- 5.12
I've tried that branch, but encountered another NULL pointer.

[  224.290720] run blktests nvme/012 at 2021-02-08 01:46:25
[  224.515479] BUG: kernel NULL pointer dereference, address: 0000000000000368
[  224.522442] #PF: supervisor read access in kernel mode
[  224.527590] #PF: error_code(0x0000) - not-present page
[  224.532737] PGD 0 P4D 0
[  224.535276] Oops: 0000 [#1] SMP PTI
[  224.538770] CPU: 0 PID: 2403 Comm: multipath Tainted: G S I       5.11.0-rc5+ #1 [  224.546776] Hardware name: Dell Inc. PowerEdge R640/08HT8T, BIOS 2.10.0 11/12/2020
[  224.554340] RIP: 0010:bio_associate_blkg_from_css+0xbd/0x2c0
[  224.560009] Code: 03 75 79 65 48 ff 00 e8 b1 94 cc ff e8 ac 94 cc ff 49 89 5c 24 48 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 44 24 08 <48> 8b 808
[  224.578761] RSP: 0018:ffffad748ec5bd78 EFLAGS: 00010246
[  224.583988] RAX: 0000000000000000 RBX: ffffa0e96452b3c0 RCX: 0000000000000000 [  224.591120] RDX: ffffa11077088000 RSI: ffffffffa06edb20 RDI: ffffa0e96452b3c0 [  224.598253] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000002 [  224.605385] R10: 0000000000000022 R11: 0000000000000c10 R12: ffffa0e96452b3c0 [  224.612516] R13: ffffffffa06edb20 R14: 0000000000001000 R15: 0000000000000000 [  224.619649] FS:  00007fb6647feb00(0000) GS:ffffa107f0000000(0000) knlGS:0000000000000000
[  224.627734] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  224.633479] CR2: 0000000000000368 CR3: 0000002884b54002 CR4: 00000000007706f0 [  224.640611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [  224.647751] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  224.654884] PKRU: 55555554
[  224.657595] Call Trace:
[  224.660043]  bio_associate_blkg+0x20/0x70
[  224.664055]  nvme_submit_user_cmd+0xd7/0x340 [nvme_core]
[  224.669375]  nvme_user_cmd.isra.82+0x120/0x190 [nvme_core]
[  224.674859]  nvme_dev_ioctl+0x13b/0x150 [nvme_core]
[  224.679738]  __x64_sys_ioctl+0x84/0xc0
[  224.683500]  do_syscall_64+0x33/0x40
[  224.687088]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  224.692149] RIP: 0033:0x7fb66338161b
[  224.695726] Code: 0f 1e fa 48 8b 05 6d b8 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 018 [  224.714469] RSP: 002b:00007ffeae59aba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [  224.722036] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fb66338161b [  224.729168] RDX: 00007ffeae59abb0 RSI: 00000000c0484e41 RDI: 0000000000000003 [  224.736300] RBP: 00007ffeae59ac10 R08: 0000559af0bb3d20 R09: 00007fb66340e580 [  224.743431] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffeae59bcb0 [  224.750563] R13: 0000559af0bb3d20 R14: 0000000000000003 R15: 0000000000000000 [  224.757697] Modules linked in: loop nvme_tcp nvme_fabrics nvme_core nvmet_tcp nvmet rfkill sunrpc dm_multipath intel_rapl_msr intel_rapl_common isst_if_cod
[  224.833264] CR2: 0000000000000368
[  224.836604] ---[ end trace fa77e7cc26f8cfc1 ]---
[  224.883915] RIP: 0010:bio_associate_blkg_from_css+0xbd/0x2c0
[  224.889584] Code: 03 75 79 65 48 ff 00 e8 b1 94 cc ff e8 ac 94 cc ff 49 89 5c 24 48 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e 41 5f c3 49 8b 44 24 08 <48> 8b 808
[  224.908329] RSP: 0018:ffffad748ec5bd78 EFLAGS: 00010246
[  224.913554] RAX: 0000000000000000 RBX: ffffa0e96452b3c0 RCX: 0000000000000000 [  224.920686] RDX: ffffa11077088000 RSI: ffffffffa06edb20 RDI: ffffa0e96452b3c0 [  224.927818] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000002 [  224.934948] R10: 0000000000000022 R11: 0000000000000c10 R12: ffffa0e96452b3c0 [  224.942082] R13: ffffffffa06edb20 R14: 0000000000001000 R15: 0000000000000000 [  224.949213] FS:  00007fb6647feb00(0000) GS:ffffa107f0000000(0000) knlGS:0000000000000000
[  224.957301] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  224.963044] CR2: 0000000000000368 CR3: 0000002884b54002 CR4: 00000000007706f0 [  224.970177] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [  224.977309] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  224.984441] PKRU: 55555554
[  224.987155] Kernel panic - not syncing: Fatal exception
[  225.425279] Kernel Offset: 0x1d400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[  225.478268] ---[ end Kernel panic - not syncing: Fatal exception ]---

Looking at the commit log it seems like nvme-tcp commits might be
the issue given that you got the problem in nvme_tcp_init_iter()
and just eliminating by the looking at the code
commit 9f99a29e5307 can lead to this behavior.

*I may be completely wrong as I'm not expert in the nvme-tcp host.*



_______________________________________________
Linux-nvme mailing list
Linux-nvme@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/linux-nvme





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux