On Tue, Jul 04, 2023 at 07:04:00PM +0200, Marc Hartmayer wrote:
> On Thu, Jun 22, 2023 at 12:01 AM +0800, Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> > From: Yu Kuai <yukuai3@xxxxxxxxxx>
> > diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> > index 2433eeef042a..dcb73787c29d 100644
> > --- a/drivers/scsi/sg.c
> > +++ b/drivers/scsi/sg.c
> > @@ -1497,7 +1497,7 @@ sg_add_device(struct device *cl_dev)
> >  	int error;
> >  	unsigned long iflags;
> >
> > -	error = scsi_device_get(scsidp);
> > +	error = blk_get_queue(scsidp->request_queue);
> >  	if (error)
> >  		return error;

Might be interesting as well. Marc showed me a `dmesg` snippet earlier
from when the bind fails:

[ 15.441817] scsi host2: scsi_eh_2: sleeping
[ 15.441899] scsi_debug:sdebug_driver_probe: scsi_debug: trim poll_queues to 0. poll_q/nr_hw = (0/1)
[ 15.441907] scsi host2: scsi_debug: version 0191 [20210520] dev_size_mb=8, opts=0x0, submit_queues=1, statistics=0
[ 15.442078] scsi host2: scsi_scan_host_selected: <4294967295:4294967295:18446744073709551615>
[ 15.442267] scsi 2:0:0:0: scsi scan: INQUIRY pass 1 length 36
[ 15.442286] scsi 2:0:0:0: scsi scan: INQUIRY successful with code 0x0
[ 15.442296] scsi 2:0:0:0: scsi scan: INQUIRY pass 2 length 96
[ 15.442308] scsi 2:0:0:0: scsi scan: INQUIRY successful with code 0x0
[ 15.442317] scsi 2:0:0:0: Direct-Access Linux scsi_debug 0191 PQ: 0 ANSI: 7
[ 15.442554] scsi 2:0:0:0: Power-on or device reset occurred
[ 15.442560] scsi 2:0:0:0: tag#50 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 15.442565] scsi 2:0:0:0: tag#50 CDB: Report supported operation codes a3 0c 01 88 00 00 00 00 00 14 00 00
[ 15.442569] scsi 2:0:0:0: tag#50 Sense Key : Unit Attention [current]
[ 15.442573] scsi 2:0:0:0: tag#50 Add. Sense: Power on occurred

The bind should happen around here somewhere, I think.

[ 15.472680] sd 2:0:0:0: scsi scan: Sending REPORT LUNS to (try 0)
[ 15.472703] sd 2:0:0:0: scsi scan: REPORT LUNS successful (try 0) result 0x0
[ 15.472706] sd 2:0:0:0: scsi scan: REPORT LUN scan
[ 15.472709] sd 2:0:0:0: scsi scan: device exists on 2:0:0:0
[ 15.492874] sd 2:0:0:0: [sdi] 16384 512-byte logical blocks: (8.39 MB/8.00 MiB)
[ 15.502853] sd 2:0:0:0: [sdi] Write Protect is off
[ 15.502856] sd 2:0:0:0: [sdi] Mode Sense: 73 00 10 08
[ 15.522819] sd 2:0:0:0: [sdi] Write cache: enabled, read cache: enabled, supports DPO and FUA
[ 15.552773] sd 2:0:0:0: [sdi] Preferred minimum I/O size 512 bytes
[ 15.552776] sd 2:0:0:0: [sdi] Optimal transfer size 524288 bytes
[ 15.575373] sd 2:0:0:0: [sdi] tag#62 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 15.575377] sd 2:0:0:0: [sdi] tag#62 CDB: Inquiry 12 01 b9 00 04 00
[ 15.575380] sd 2:0:0:0: [sdi] tag#62 Sense Key : Illegal Request [current]
[ 15.575383] sd 2:0:0:0: [sdi] tag#62 Add. Sense: Invalid field in cdb
[ 15.645749] sd 2:0:0:0: [sdi] Attached SCSI disk

But we don't even see the `sg_alloc: dev=...` message that is logged in
`sg_alloc()`. Between the change above and the call to `sg_alloc()`
there is only the character device allocation, and if that failed, it
would print an error.

So either the bind is never even tried, or the new `blk_get_queue()`
fails to get a reference. That is odd, since the only way that would
happen is if the queue was marked as dying; but we see that the stack is
using it for LUN probing in `sd`.

> This change (bisected) triggers a regression in our KVM on s390x CI. The
> symptom is that a “scsi_debug device” does not bind to the scsi_generic
> driver. On s390x you can reproduce the problem as follows (I have not
> tested on x86):
>
> With this patch applied:
>
> $ sudo modprobe scsi_debug

One more thing maybe worth mentioning: in the kernel configuration we
use in the CI, we have `sg` built-in. I guess most have it built as a
module.

> $ # Get the 'scsi_host,channel,target_number,LUN' tuple for the scsi_debug device
> $ lsscsi |grep scsi_debug |awk '{ print $1 }'
> [0:0:0:0]
> $ sudo stat /sys/bus/scsi/devices/0:0:0:0/scsi_generic
> stat: cannot statx '/sys/bus/scsi/devices/0:0:0:0/scsi_generic': No such file or directory
>
>
> Patch reverted:
>
> $ sudo modprobe scsi_debug
> $ lsscsi |grep scsi_debug |awk '{ print $1 }'
> [0:0:0:0]
> $ sudo stat /sys/bus/scsi/devices/0:0:0:0/scsi_generic
>   File: /sys/bus/scsi/devices/0:0:0:0/scsi_generic
>   Size: 0 Blocks: 0 IO Block: 4096 directory
> Device: 0,20 Inode: 12155 Links: 3
> …

That's all I got from looking at it earlier, so far.

--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Gregor Pillen / Geschäftsführung: David Faller
Sitz der Ges.: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294