Hi Thomas, On Mon, 2013-09-30 at 09:29 +0200, Thomas Glanzmann wrote: > Hello, > I wanted to check the open source VAAI branch so I checked out > 'for-next' of > git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git, > compiled, set it up and tried to create a VMFS file system on it. > > The target immediatly panic. The reason for that is a NULL pointer in a > spin lock and the reason for that seems to be short write (512 instead > of 1024 bytes). > Thanks for reporting. Comments below. > I rebuild the linux kernel with debugging enabled and reproduced the > issue: > > Sep 29 00:30:21 node-62 kernel: [ 288.714674] Entering lio_release_cmd for se_cmd: ffff88103d10e5d0 > Sep 29 00:30:21 node-62 kernel: [ 288.714678] Incremented dev->dev_cur_ordered_id: 532 for SIMPLE: 532 > Sep 29 00:30:21 node-62 kernel: [ 288.714680] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.721569] Updated MaxCmdSN to 0x00000bc8 > Sep 29 00:30:21 node-62 kernel: [ 288.721572] Built DataIN ITT: 0x00000c36, StatSN: 0x6e2c0ec8, DataSN: 0x00000000, Offset: 0, Length: 512, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.721587] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.721769] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.721771] Got SCSI Command, ITT: 0x00000c37, CmdSN: 0xb90b0000, ExpXferLen: 512, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.721775] Allocated se_ordered_id: 533 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.721776] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bba > Sep 29 00:30:21 node-62 kernel: [ 288.721785] Entering lio_release_cmd for se_cmd: ffff88103d10ef50 > Sep 29 00:30:21 node-62 kernel: [ 288.721788] Incremented dev->dev_cur_ordered_id: 533 for SIMPLE: 533 > Sep 29 00:30:21 node-62 kernel: [ 288.721789] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.728660] Updated MaxCmdSN to 0x00000bc9 > Sep 29 00:30:21 node-62 kernel: [ 288.728661] Built DataIN ITT: 0x00000c37, StatSN: 0x6e2c0ec9, DataSN: 0x00000000, Offset: 0, Length: 512, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.728669] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.728846] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.728849] Got SCSI Command, ITT: 0x00000c38, CmdSN: 0xba0b0000, ExpXferLen: 16384, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.728853] Allocated se_ordered_id: 534 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.728854] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bbb > Sep 29 00:30:21 node-62 kernel: [ 288.728866] Entering lio_release_cmd for se_cmd: ffff88103d10e5d0 > Sep 29 00:30:21 node-62 kernel: [ 288.728869] Incremented dev->dev_cur_ordered_id: 534 for SIMPLE: 534 > Sep 29 00:30:21 node-62 kernel: [ 288.728870] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.735740] Updated MaxCmdSN to 0x00000bca > Sep 29 00:30:21 node-62 kernel: [ 288.735742] Built DataIN ITT: 0x00000c38, StatSN: 0x6e2c0eca, DataSN: 0x00000000, Offset: 0, Length: 16384, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.735750] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.736201] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.736204] Got SCSI Command, ITT: 0x00000c39, CmdSN: 0xbb0b0000, ExpXferLen: 512, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.736207] Allocated se_ordered_id: 535 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.736209] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bbc > Sep 29 00:30:21 node-62 kernel: [ 288.736218] Entering lio_release_cmd for se_cmd: ffff88103d10ef50 > Sep 29 00:30:21 node-62 kernel: [ 288.736221] Incremented dev->dev_cur_ordered_id: 535 for SIMPLE: 535 > Sep 29 00:30:21 node-62 kernel: [ 288.736222] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.743093] Updated MaxCmdSN to 0x00000bcb > Sep 29 00:30:21 node-62 kernel: [ 288.743095] Built DataIN ITT: 0x00000c39, StatSN: 0x6e2c0ecb, DataSN: 0x00000000, Offset: 0, Length: 512, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.743103] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.743318] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.743321] Got SCSI Command, ITT: 0x00000c3a, CmdSN: 0xbc0b0000, ExpXferLen: 32, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.743324] Allocated se_ordered_id: 536 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.743326] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bbd > Sep 29 00:30:21 node-62 kernel: [ 288.743333] Entering lio_release_cmd for se_cmd: ffff88103d10e5d0 > Sep 29 00:30:21 node-62 kernel: [ 288.743336] Incremented dev->dev_cur_ordered_id: 536 for SIMPLE: 536 > Sep 29 00:30:21 node-62 kernel: [ 288.743337] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.750203] Updated MaxCmdSN to 0x00000bcc > Sep 29 00:30:21 node-62 kernel: [ 288.750205] Built DataIN ITT: 0x00000c3a, StatSN: 0x6e2c0ecc, DataSN: 0x00000000, Offset: 0, Length: 32, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.750213] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.750395] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.750398] Got SCSI Command, ITT: 0x00000c3b, CmdSN: 0xbd0b0000, ExpXferLen: 16, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.750402] Allocated se_ordered_id: 537 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.750403] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bbe > Sep 29 00:30:21 node-62 kernel: [ 288.750411] Entering lio_release_cmd for se_cmd: ffff88103d10ef50 > Sep 29 00:30:21 node-62 kernel: [ 288.750414] Incremented dev->dev_cur_ordered_id: 537 for SIMPLE: 537 > Sep 29 00:30:21 node-62 kernel: [ 288.750415] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.757288] Updated MaxCmdSN to 0x00000bcd > Sep 29 00:30:21 node-62 kernel: [ 288.757290] Built DataIN ITT: 0x00000c3b, StatSN: 0x6e2c0ecd, DataSN: 0x00000000, Offset: 0, Length: 16, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.757309] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.757487] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.757490] Got SCSI Command, ITT: 0x00000c3c, CmdSN: 0xbe0b0000, ExpXferLen: 512, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.757493] Allocated se_ordered_id: 538 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.757495] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bbf > Sep 29 00:30:21 node-62 kernel: [ 288.757504] Entering lio_release_cmd for se_cmd: ffff88103d10e5d0 > Sep 29 00:30:21 node-62 kernel: [ 288.757507] Incremented dev->dev_cur_ordered_id: 538 for SIMPLE: 538 > Sep 29 00:30:21 node-62 kernel: [ 288.757508] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.764380] Updated MaxCmdSN to 0x00000bce > Sep 29 00:30:21 node-62 kernel: [ 288.764381] Built DataIN ITT: 0x00000c3c, StatSN: 0x6e2c0ece, DataSN: 0x00000000, Offset: 0, Length: 512, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.764389] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.764563] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.764566] Got SCSI Command, ITT: 0x00000c3d, CmdSN: 0xbf0b0000, ExpXferLen: 512, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.764569] Allocated se_ordered_id: 539 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.764571] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bc0 > Sep 29 00:30:21 node-62 kernel: [ 288.764579] Entering lio_release_cmd for se_cmd: ffff88103d10ef50 > Sep 29 00:30:21 node-62 kernel: [ 288.764582] Incremented dev->dev_cur_ordered_id: 539 for SIMPLE: 539 > Sep 29 00:30:21 node-62 kernel: [ 288.764583] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.771454] Updated MaxCmdSN to 0x00000bcf > Sep 29 00:30:21 node-62 kernel: [ 288.771456] Built DataIN ITT: 0x00000c3d, StatSN: 0x6e2c0ecf, DataSN: 0x00000000, Offset: 0, Length: 512, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.771463] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.771640] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.771643] Got SCSI Command, ITT: 0x00000c3e, CmdSN: 0xc00b0000, ExpXferLen: 16384, Length: 0, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.771646] Allocated se_ordered_id: 540 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.771648] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bc1 > Sep 29 00:30:21 node-62 kernel: [ 288.771659] Entering lio_release_cmd for se_cmd: ffff88103d10e5d0 > Sep 29 00:30:21 node-62 kernel: [ 288.771662] Incremented dev->dev_cur_ordered_id: 540 for SIMPLE: 540 > Sep 29 00:30:21 node-62 kernel: [ 288.771664] DMA_FROM_DEVICE: cmd->se_lun->lun_sep_lock = ffff88103be9d020 > Sep 29 00:30:21 node-62 kernel: [ 288.778539] Updated MaxCmdSN to 0x00000bd0 > Sep 29 00:30:21 node-62 kernel: [ 288.778541] Built DataIN ITT: 0x00000c3e, StatSN: 0x6e2c0ed0, DataSN: 0x00000000, Offset: 0, Length: 16384, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.778548] tx_loop: 48, total_tx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.779027] rx_loop: 48, total_rx: 48, data: 48 > Sep 29 00:30:21 node-62 kernel: [ 288.779030] Got SCSI Command, ITT: 0x00000c3f, CmdSN: 0xc10b0000, ExpXferLen: 1024, Length: 1024, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.779033] Allocated se_ordered_id: 541 for Task Attr: 0x20 on fileio > Sep 29 00:30:21 node-62 kernel: [ 288.779039] rx_loop: 1024, total_rx: 1024, data: 1024 > Sep 29 00:30:21 node-62 kernel: [ 288.779040] Received CmdSN matches ExpCmdSN, incremented ExpCmdSN to: 0x00000bc2 > Sep 29 00:30:21 node-62 kernel: [ 288.779047] Entering lio_release_cmd for se_cmd: ffff88103d10ef50 > Sep 29 00:30:21 node-62 kernel: [ 288.779050] Incremented dev->dev_cur_ordered_id: 541 for SIMPLE: 541 > Sep 29 00:30:21 node-62 kernel: [ 288.779066] fd_do_rw() write returned 512 > Sep 29 00:30:21 node-62 kernel: [ 288.783159] -----[ Storage Engine Exception for cmd: ffff88103d10e5d0 ITT: 0x00000c3f CDB: 0x89 > Sep 29 00:30:21 node-62 kernel: [ 288.783161] -----[ i_state: 6 t_state: 5 sense_reason: 10 > Sep 29 00:30:21 node-62 kernel: [ 288.783168] -----[ CMD_T_ACTIVE: 1 CMD_T_STOP: 0 CMD_T_SENT: 0 > Sep 29 00:30:21 node-62 kernel: [ 288.783169] Incremented dev_cur_ordered_id: 542 for HEAD_OF_QUEUE: 541 > Sep 29 00:30:21 node-62 kernel: [ 288.783173] DMA_TO_DEVICE: cmd->se_lun->lun_sep_lock = 0000000000000020 > Sep 29 00:30:21 node-62 kernel: [ 288.789899] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020 > Sep 29 00:30:21 node-62 kernel: [ 288.797941] IP: [<ffffffff8107ad66>] do_raw_spin_lock+0x5/0x1b > Sep 29 00:30:21 node-62 kernel: [ 288.803924] PGD 0 > Sep 29 00:30:21 node-62 kernel: [ 288.806073] Oops: 0002 [#1] SMP > Sep 29 00:30:21 node-62 kernel: [ 288.809506] Modules linked in: tcm_loop(O) tcm_fc(O) libfc scsi_transport_fc scsi_tgt iscsi_target_mod(O) target_core_pscsi(O) target_core_file(O) target_core_iblock(O) target_core_mod(O) configfs bridge 8021q garp stp llc bonding joydev hid_generic usbhid hid loop x86_pkg_temp_thermal coretemp kvm_intel kvm crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper ehci_pci ehci_hcd snd_pcm usbcore snd_page_alloc iTCO_wdt snd_timer iTCO_vendor_support snd psmouse evdev sb_edac soundcore acpi_cpufreq microcode edac_core mperf ioatdma serio_raw pcspkr i2c_i801 lpc_ich usb_common mfd_core processor wmi thermal_sys button ext4 crc16 jbd2 mbcache sg sd_mod crc_t10dif ahci libahci libata scsi_mod igb i2c_algo_bit i2c_core dca ptp pps_core > Sep 29 00:30:21 node-62 kernel: [ 288.883627] CPU: 0 PID: 293 Comm: kworker/0:3 Tainted: G O 3.11.0-rc5+ #2 > Sep 29 00:30:21 node-62 kernel: [ 288.891358] Hardware name: Supermicro X9SRD-F/X9SRD-F, BIOS 1.0a 10/15/2012 > Sep 29 00:30:21 node-62 kernel: [ 288.898386] Workqueue: target_completion target_complete_ok_work [target_core_mod] > Sep 29 00:30:21 node-62 kernel: [ 288.906099] task: ffff88102166c890 ti: ffff8810232c8000 task.ti: ffff8810232c8000 > Sep 29 00:30:21 node-62 kernel: [ 288.913675] RIP: 0010:[<ffffffff8107ad66>] [<ffffffff8107ad66>] do_raw_spin_lock+0x5/0x1b > Sep 29 00:30:21 node-62 kernel: [ 288.922091] RSP: 0018:ffff8810232c9e20 EFLAGS: 00010202 > Sep 29 00:30:21 node-62 kernel: [ 288.927483] RAX: 0000000000010000 RBX: ffff88103d10e768 RCX: 0000000000000000 > Sep 29 00:30:21 node-62 kernel: [ 288.934683] RDX: ffff88107fc0ee60 RSI: ffff88107fc0d408 RDI: 0000000000000020 > Sep 29 00:30:21 node-62 kernel: [ 288.941877] RBP: ffff88103d10e5d0 R08: 0000000000000000 R09: 0000000000004d00 > Sep 29 00:30:21 node-62 kernel: [ 288.949079] R10: 0000000000004d00 R11: 0000000000004d00 R12: ffff88107fc12740 > Sep 29 00:30:21 node-62 kernel: [ 288.956271] R13: ffffe8ffffc01a00 R14: 0000000000000000 R15: 0000000000000000 > Sep 29 00:30:21 node-62 kernel: [ 288.963474] FS: 0000000000000000(0000) GS:ffff88107fc00000(0000) knlGS:0000000000000000 > Sep 29 00:30:21 node-62 kernel: [ 288.971637] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > Sep 29 00:30:21 node-62 kernel: [ 288.977444] CR2: 0000000000000020 CR3: 000000000160b000 CR4: 00000000000407f0 > Sep 29 00:30:21 node-62 kernel: [ 288.984637] Stack: > Sep 29 00:30:21 node-62 kernel: [ 288.986709] ffffffffa03aa8eb 0000000000000000 ffff881021ce8ec0 ffff88103d10e768 > Sep 29 00:30:21 node-62 kernel: [ 288.994437] ffffffff8104a554 ffff8810232c9fd8 000000002166c890 0000000000000000 > Sep 29 00:30:21 node-62 kernel: [ 289.002167] ffff881021ce8ec0 ffff88107fc12740 ffff881021ce8ef0 ffff88107fc12758 > Sep 29 00:30:21 node-62 kernel: [ 289.009898] Call Trace: > Sep 29 00:30:21 node-62 kernel: [ 289.012429] [<ffffffffa03aa8eb>] ? target_complete_ok_work+0x136/0x238 [target_core_mod] > Sep 29 00:30:21 node-62 kernel: [ 289.020690] [<ffffffff8104a554>] ? process_one_work+0x191/0x28f > Sep 29 00:30:21 node-62 kernel: [ 289.026757] [<ffffffff8104aa01>] ? worker_thread+0x121/0x1e7 > Sep 29 00:30:21 node-62 kernel: [ 289.032565] [<ffffffff8104a8e0>] ? rescuer_thread+0x269/0x269 > Sep 29 00:30:21 node-62 kernel: [ 289.038463] [<ffffffff8104f441>] ? kthread+0x81/0x89 > Sep 29 00:30:21 node-62 kernel: [ 289.043579] [<ffffffff8104f3c0>] ? __kthread_parkme+0x5d/0x5d > Sep 29 00:30:21 node-62 kernel: [ 289.049480] [<ffffffff81378cec>] ? ret_from_fork+0x7c/0xb0 > Sep 29 00:30:21 node-62 kernel: [ 289.055115] [<ffffffff8104f3c0>] ? __kthread_parkme+0x5d/0x5d > Sep 29 00:30:21 node-62 kernel: [ 289.061008] Code: d0 f0 0f b1 0f 39 d0 0f 94 c0 0f b6 c0 c3 31 c0 48 81 ff 80 39 37 81 72 0c 31 c0 48 81 ff fe 3b 37 81 0f 92 c0 c3 b8 00 00 01 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 d0 74 07 f3 90 66 8b 07 eb f4 > Sep 29 00:30:21 node-62 kernel: [ 289.084152] RIP [<ffffffff8107ad66>] do_raw_spin_lock+0x5/0x1b > Sep 29 00:30:21 node-62 kernel: [ 289.090200] RSP <ffff8810232c9e20> > Sep 29 00:30:21 node-62 kernel: [ 289.093753] CR2: 0000000000000020 > Sep 29 00:30:21 node-62 kernel: [ 289.097181] ---[ end trace efe4b5cdc50ad28a ]--- > Sep 29 00:30:21 node-62 kernel: [ 289.103623] Updated MaxCmdSN to 0x00000bd1 > Sep 29 00:30:21 node-62 kernel: [ 289.107792] Built SCSI Response, ITT: 0x00000c3f, StatSN: 0x6e2c0ed1, Response: 0x02, SAM Status: 0x02, CID: 0 > Sep 29 00:30:21 node-62 kernel: [ 289.117885] Adding 2 bytes of padding to SENSE. > Sep 29 00:30:22 node-62 kernel: [ 289.122514] Attaching SENSE DATA: 98 bytes to iSCSI Response PDU > > The null pointer happens short before the: > > fd_do_rw() write returned 512 > > drivers/target/target_core_file.c line 298 > > if (is_write) { > if (ret < 0 || ret != cmd->data_length) { > pr_err("%s() write returned %d\n", __func__, ret); > return (ret < 0 ? ret : -EINVAL); > } > ... > So there are two separate bugs here. The first is specific to FILEIO, where the size of the submitted I/O (512) for COMPARE_AND_WRITE is different from the fabric received cmd->data_length (1024). So the (ret != cmd->data_length) checks should either be aware of the difference for the special COMPARE_AND_WRITE case, or cmd->data_length should be reset to the different value before the initial READ I/O submission. The second appears to be some manner of double completion in the failure path for the WRITE payload submission of COMPARE_AND_WRITE. In any event, I'm tracking these down now, and should be posting a bugfix series later this afternoon. --nab -- To unsubscribe from this list: send the line "unsubscribe target-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html