On Sun, 2018-04-01 at 07:37 -0400, Wakko Warner wrote: > Bart Van Assche wrote: > > On Sat, 2018-03-31 at 18:12 -0400, Wakko Warner wrote: > > > Richard Weinberger wrote: > > > > On Sat, Mar 31, 2018 at 3:59 AM, Wakko Warner <wakko@xxxxxxxxxxxx> wrote: > > > > > I reported this before but noone responded. > > > > > > > > Because you're sending only to LKML. > > > > CC'ing storage folks. > > > > > > Thank you. I wasn't sure who I needed to send it to. > > > > Can you share the output of lsscsi? I would like to know whether or not you > > are using a (S)ATA CDROM. > > From the target: > [4:0:0:0] cd/dvd ATAPI iHAS224 B GL05 /dev/sr0 > [5:0:0:0] cd/dvd ATAPI iHAS422 8 4L11 /dev/sr1 > [6:0:0:0] cd/dvd PBDS DVD+-RW DH-16W1S 2D14 /dev/sr2 > > From the initiator: > [19:0:0:0] cd/dvd ATAPI iHAS224 B GL05 /dev/sr1 > [19:0:0:1] cd/dvd ATAPI iHAS422 8 4L11 /dev/sr2 > [19:0:0:2] cd/dvd PBDS DVD+-RW DH-16W1S 2D14 /dev/sr3 > > I tested 4.14.32 last night with the same oops. 4.9.91 works fine. > From the initiator, if I do cat /dev/sr1 > /dev/null it works. If I mount > /dev/sr1 and then do find -type f | xargs cat > /dev/null the target > crashes. I'm using the builtin iscsi target with pscsi. I can burn from > the initiator with out problems. I'll test other kernels between 4.9 and > 4.14. (+Lee and Chris) Hello Wakko, Although I'm not sure that what I ran into is exactly the same as what you ran into, there is definitely something wrong with what I encountered. What I ran into with Linus' latest master branch indicates two issues - one in the iSCSI initiator and one in the block layer: scsi 3:0:0:1: Direct-Access LIO-ORG FILEIO 4.0 PQ: 0 ANSI: 5 sd 2:0:0:1: [sdd] Attached SCSI disk sd 3:0:0:1: Warning! Received an indication that the LUN assignments on this target have changed. The Linux SCSI layer does not automatical sd 3:0:0:1: Attached scsi generic sg8 type 0 sd 3:0:0:1: [sdf] 128 512-byte logical blocks: (65.5 kB/64.0 KiB) sd 3:0:0:1: [sdf] Write Protect is off sd 3:0:0:1: [sdf] Mode Sense: 43 00 00 08 sd 3:0:0:1: [sdf] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA iSCSI/iqn.1993-08.org.debian:01:3b68b1b3d2eb: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION. sd 3:0:0:2: [sde] Attached SCSI disk sd 3:0:0:1: [sdf] Attached SCSI disk ===================================================== WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected 4.16.0-rc7-dbg+ #3 Not tainted ----------------------------------------------------- kworker/6:1H/155 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: (&(&session->frwd_lock)->rlock){+.-.}, at: [<000000007eb678ec>] iscsi_eh_cmd_timed_out+0x6b/0x5a0 [libiscsi] and this task is already holding: (&(&q->__queue_lock)->rlock){-.-.}, at: [<000000003c5841ec>] blk_timeout_work+0x45/0x220 which would create a new lock dependency: (&(&q->__queue_lock)->rlock){-.-.} -> (&(&session->frwd_lock)->rlock){+.-.} but this new dependency connects a HARDIRQ-irq-safe lock: (&(&q->__queue_lock)->rlock){-.-.} ... which became HARDIRQ-irq-safe at: _raw_spin_lock_irqsave+0x46/0x60 ata_qc_schedule_eh+0xb9/0x110 [libata] ata_sff_hsm_move+0x114/0xb10 [libata] __ata_sff_port_intr+0xec/0x1a0 [libata] ata_bmdma_port_intr+0xef/0x160 [libata] ata_bmdma_interrupt+0x160/0x2e0 [libata] __handle_irq_event_percpu+0x72/0x460 handle_irq_event_percpu+0x66/0xd0 handle_irq_event+0x54/0x90 handle_edge_irq+0xe9/0x2d0 handle_irq+0x17b/0x210 do_IRQ+0x6c/0x140 ret_from_intr+0x0/0x1e native_safe_halt+0x2/0x10 default_idle+0x1f/0x200 do_idle+0x1bc/0x1f0 cpu_startup_entry+0x19/0x20 start_kernel+0x47f/0x4a1 secondary_startup_64+0xa5/0xb0 to a HARDIRQ-irq-unsafe lock: (&(&session->frwd_lock)->rlock){+.-.} ... which became HARDIRQ-irq-unsafe at: ... _raw_spin_lock_bh+0x34/0x40 iscsi_conn_setup+0x239/0x320 [libiscsi] iscsi_tcp_conn_setup+0x14/0x80 [libiscsi_tcp] iscsi_sw_tcp_conn_create+0x1f/0x24a [iscsi_tcp] iscsi_if_rx+0x13c9/0x20d0 [scsi_transport_iscsi] netlink_unicast+0x299/0x330 netlink_sendmsg+0x435/0x580 ___sys_sendmsg+0x447/0x4d0 __sys_sendmsg+0xad/0x120 do_syscall_64+0xf3/0x2c0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 other info that might help us debug this: Possible interrupt unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&session->frwd_lock)->rlock); local_irq_disable(); lock(&(&q->__queue_lock)->rlock); lock(&(&session->frwd_lock)->rlock); <Interrupt> lock(&(&q->__queue_lock)->rlock); *** DEADLOCK *** 3 locks held by kworker/6:1H/155: #0: ((wq_completion)"kblockd"){+.+.}, at: [<00000000116fed84>] process_one_work+0x387/0xa50 #1: ((work_completion)(&q->timeout_work)){+.+.}, at: [<00000000116fed84>] process_one_work+0x387/0xa50 #2: (&(&q->__queue_lock)->rlock){-.-.}, at: [<000000003c5841ec>] blk_timeout_work+0x45/0x220 the dependencies between HARDIRQ-irq-safe lock and the holding lock: -> (&(&q->__queue_lock)->rlock){-.-.} ops: 5585 { IN-HARDIRQ-W at: _raw_spin_lock_irqsave+0x46/0x60 ata_qc_schedule_eh+0xb9/0x110 [libata] ata_sff_hsm_move+0x114/0xb10 [libata] __ata_sff_port_intr+0xec/0x1a0 [libata] ata_bmdma_port_intr+0xef/0x160 [libata] ata_bmdma_interrupt+0x160/0x2e0 [libata] __handle_irq_event_percpu+0x72/0x460 handle_irq_event_percpu+0x66/0xd0 handle_irq_event+0x54/0x90 handle_edge_irq+0xe9/0x2d0 handle_irq+0x17b/0x210 do_IRQ+0x6c/0x140 ret_from_intr+0x0/0x1e native_safe_halt+0x2/0x10 default_idle+0x1f/0x200 do_idle+0x1bc/0x1f0 cpu_startup_entry+0x19/0x20 start_kernel+0x47f/0x4a1 secondary_startup_64+0xa5/0xb0 IN-SOFTIRQ-W at: _raw_spin_lock_irqsave+0x46/0x60 scsi_end_request+0x199/0x310 [scsi_mod] scsi_io_completion+0x3be/0x980 [scsi_mod] blk_done_softirq+0x177/0x1c0 __do_softirq+0x117/0x5f5 irq_exit+0xe8/0xf0 do_IRQ+0xb6/0x140 ret_from_intr+0x0/0x1e native_safe_halt+0x2/0x10 default_idle+0x1f/0x200 do_idle+0x1bc/0x1f0 cpu_startup_entry+0x19/0x20 start_kernel+0x47f/0x4a1 secondary_startup_64+0xa5/0xb0 INITIAL USE at: _raw_spin_lock_irq+0x3b/0x50 blkcg_init_queue+0x97/0x1c0 blk_alloc_queue_node+0x41d/0x4b0 blk_mq_init_queue+0x24/0x60 virtblk_probe+0x633/0xfef [virtio_blk] virtio_dev_probe+0x259/0x380 [virtio] driver_probe_device+0x469/0x680 __driver_attach+0xef/0x120 bus_for_each_dev+0xe4/0x130 bus_add_driver+0x24c/0x380 driver_register+0xc6/0x170 ata_generic_init_one+0x5b/0x260 [ata_generic] do_one_initcall+0x79/0x1b7 do_init_module+0xde/0x32d load_module+0x3964/0x47d0 SYSC_finit_module+0x176/0x1a0 do_syscall_64+0xf3/0x2c0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 } ... key at: [<00000000da510125>] __key.53375+0x0/0x40 ... acquired at: _raw_spin_lock+0x2f/0x40 iscsi_eh_cmd_timed_out+0x6b/0x5a0 [libiscsi] scsi_times_out+0xcc/0x3f0 [scsi_mod] blk_rq_timed_out+0x2f/0x70 blk_timeout_work+0x1b0/0x220 process_one_work+0x446/0xa50 worker_thread+0x7b/0x6d0 kthread+0x1b7/0x1e0 ret_from_fork+0x24/0x30 the dependencies between the lock to be acquired and HARDIRQ-irq-unsafe lock: -> (&(&session->frwd_lock)->rlock){+.-.} ops: 1985 { HARDIRQ-ON-W at: _raw_spin_lock_bh+0x34/0x40 iscsi_conn_setup+0x239/0x320 [libiscsi] iscsi_tcp_conn_setup+0x14/0x80 [libiscsi_tcp] iscsi_sw_tcp_conn_create+0x1f/0x24a [iscsi_tcp] iscsi_if_rx+0x13c9/0x20d0 [scsi_transport_iscsi] netlink_unicast+0x299/0x330 netlink_sendmsg+0x435/0x580 ___sys_sendmsg+0x447/0x4d0 __sys_sendmsg+0xad/0x120 do_syscall_64+0xf3/0x2c0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 IN-SOFTIRQ-W at: _raw_spin_lock_bh+0x34/0x40 iscsi_queuecommand+0xef/0x960 [libiscsi] scsi_dispatch_cmd+0x1c7/0x550 [scsi_mod] scsi_request_fn+0x823/0xaf0 [scsi_mod] __blk_run_queue+0xc5/0x160 blk_run_queue+0x48/0x70 scsi_run_queue+0x475/0x5d0 [scsi_mod] scsi_end_request+0x1cd/0x310 [scsi_mod] scsi_io_completion+0x3be/0x980 [scsi_mod] blk_done_softirq+0x177/0x1c0 __do_softirq+0x117/0x5f5 run_ksoftirqd+0x2e/0x50 smpboot_thread_fn+0x314/0x410 kthread+0x1b7/0x1e0 ret_from_fork+0x24/0x30 INITIAL USE at: _raw_spin_lock_bh+0x34/0x40 iscsi_conn_setup+0x239/0x320 [libiscsi] iscsi_tcp_conn_setup+0x14/0x80 [libiscsi_tcp] iscsi_sw_tcp_conn_create+0x1f/0x24a [iscsi_tcp] iscsi_if_rx+0x13c9/0x20d0 [scsi_transport_iscsi] netlink_unicast+0x299/0x330 netlink_sendmsg+0x435/0x580 ___sys_sendmsg+0x447/0x4d0 __sys_sendmsg+0xad/0x120 do_syscall_64+0xf3/0x2c0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 } ... key at: [<00000000df8f47b5>] __key.68290+0x0/0xffffffffffff6300 [libiscsi] ... acquired at: _raw_spin_lock+0x2f/0x40 iscsi_eh_cmd_timed_out+0x6b/0x5a0 [libiscsi] scsi_times_out+0xcc/0x3f0 [scsi_mod] blk_rq_timed_out+0x2f/0x70 blk_timeout_work+0x1b0/0x220 process_one_work+0x446/0xa50 worker_thread+0x7b/0x6d0 kthread+0x1b7/0x1e0 ret_from_fork+0x24/0x30 stack backtrace: CPU: 6 PID: 155 Comm: kworker/6:1H Not tainted 4.16.0-rc7-dbg+ #3 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: kblockd blk_timeout_work Call Trace: dump_stack+0x85/0xc5 check_usage+0x6e7/0x700 ? check_usage_forwards+0x220/0x220 ? find_next_and_bit+0x51/0xe0 ? cpumask_next_and+0x20/0x30 ? find_busiest_group+0xc94/0x1010 ? class_equal+0x11/0x20 ? __bfs+0x62/0x2e0 ? class_equal+0x11/0x20 ? __bfs+0xfb/0x2e0 ? __lock_acquire+0x17aa/0x1af0 __lock_acquire+0x17aa/0x1af0 ? mark_lock+0xc7/0x770 ? debug_check_no_locks_freed+0x1b0/0x1b0 ? __lock_acquire+0x583/0x1af0 ? mark_lock+0xc7/0x770 ? lock_pin_lock+0x160/0x160 ? debug_check_no_locks_freed+0x1b0/0x1b0 ? lock_acquire+0xc9/0x260 lock_acquire+0xc9/0x260 ? iscsi_eh_cmd_timed_out+0x6b/0x5a0 [libiscsi] _raw_spin_lock+0x2f/0x40 ? iscsi_eh_cmd_timed_out+0x6b/0x5a0 [libiscsi] iscsi_eh_cmd_timed_out+0x6b/0x5a0 [libiscsi] scsi_times_out+0xcc/0x3f0 [scsi_mod] blk_rq_timed_out+0x2f/0x70 blk_timeout_work+0x1b0/0x220 process_one_work+0x446/0xa50 ? pwq_dec_nr_in_flight+0x100/0x100 ? worker_thread+0x177/0x6d0 worker_thread+0x7b/0x6d0 ? process_one_work+0xa50/0xa50 kthread+0x1b7/0x1e0 ? kthread_create_worker_on_cpu+0xc0/0xc0 ret_from_fork+0x24/0x30 [ ... ] ------------[ cut here ]------------ kernel BUG at block/blk-core.c:3267! invalid opcode: 0000 [#1] PREEMPT SMP KASAN Modules linked in: sd_mod crc32c_generic target_core_pscsi target_core_iblock target_core_file iscsi_target_mod target_core_mod brd i2c_piix4 sg virtio_balloon i2c_core af_packet button ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 hid_generic usbhid hid ext4 crc16 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi virtio_blk virtio_net ata_piix xhci_pci xhci_hcd libata virtio_pci usbcore scsi_mod virtio_ring intel_agp usb_common intel_gtt virtio agpgart CPU: 2 PID: 151 Comm: scsi_eh_1 Tainted: G W 4.16.0-rc7-dbg+ #3 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 RIP: 0010:__blk_end_request_all+0xda/0xe0 RSP: 0018:ffff88006192f980 EFLAGS: 00010002 sr 2:0:0:3: rejecting I/O to offline device sr 3:0:0:3: rejecting I/O to offline device RAX: 0000000000000001 RBX: ffff88006818e780 RCX: ffffffff814065a6 RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff88006818e838 RBP: 000000000000000a R08: 0000000000000000 R09: 0000000000000012 R10: ffff88006192f588 R11: 000000005e4786a3 R12: 0000000000000000 R13: 0000000000000000 R14: ffff880061280160 R15: 0000000000000001 FS: 0000000000000000(0000) GS:ffff88006d280000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1fcc53f010 CR3: 00000000666fc000 CR4: 00000000000006e0 Call Trace: blk_peek_request+0x1ff/0x5f0 scsi_request_fn+0x48/0xaf0 [scsi_mod] ? lock_acquire+0xc9/0x260 __blk_run_queue+0xc5/0x160 blk_run_queue+0x48/0x70 scsi_run_queue+0x475/0x5d0 [scsi_mod] ? scsi_io_completion+0x69e/0x980 [scsi_mod] ? sdev_evt_alloc+0x80/0x80 [scsi_mod] ? blk_queue_end_tag+0x10a/0x210 ? __list_add_valid+0x29/0xa0 ? do_raw_spin_unlock+0x91/0x120 scsi_io_completion+0x6a6/0x980 [scsi_mod] ? lock_downgrade+0x200/0x2b0 ? scsi_end_request+0x310/0x310 [scsi_mod] ? scsi_device_unbusy+0x3b/0x60 [scsi_mod] scsi_eh_flush_done_q+0x1a6/0x210 [scsi_mod] ata_scsi_port_error_handler+0x794/0xb00 [libata] ata_scsi_error+0x168/0x1a0 [libata] ? ata_scsi_port_error_handler+0xb00/0xb00 [libata] ? _raw_spin_unlock_irqrestore+0x59/0x70 scsi_error_handler+0x1b5/0xa40 [scsi_mod] ? scsi_eh_get_sense+0x3b0/0x3b0 [scsi_mod] ? __sched_text_start+0x8/0x8 ? __wake_up_common+0x9c/0x230 ? mark_held_locks+0x1c/0x90 ? _raw_spin_unlock_irqrestore+0x59/0x70 ? scsi_eh_get_sense+0x3b0/0x3b0 [scsi_mod] kthread+0x1b7/0x1e0 ? kthread_create_worker_on_cpu+0xc0/0xc0 ret_from_fork+0x24/0x30 Code: 85 c0 75 02 0f 0b 48 89 df e8 b3 99 eb ff 4c 8b 23 e9 6e ff ff ff 0f 0b eb 82 49 8d 7c 24 20 e8 9d 98 eb ff 45 8b 6c 24 20 eb 8c <0f> 0b 0f 1f 40 00 0f 1f 44 00 00 41 57 41 56 41 55 41 54 55 48 RIP: __blk_end_request_all+0xda/0xe0 RSP: ffff88006192f980 ---[ end trace b9c2429e31acedb4 ]---