Optical drive (CD/DVD/Blu-ray), connected via USB, passed-through SCSI, distributed via iSCSI, error messages in dmesg every second, slower average read speed at 2 MB/s, normal behavior with Microsoft Initiator

ffp@xxxxxxxxx · Sun, 11 Feb 2024 19:29:10 +0000

[1.] One line summary of the problem:

Optical drive (CD/DVD/Blu-ray), connected via USB, passed-through SCSI, 
distributed via iSCSI, error messages in dmesg every second, slower 
average read speed at 2 MB/s, normal behavior with Microsoft Initiator

[2.] Full description of the problem/report:

Hello everyone,

I hope I am writing to the right people and describing the problem 
reasonably well; this is my first bug report. I have spent two months 
trying to find a solution at the application level, so far without 
success. There are indications that the error lies in the kernel, but I 
cannot prove this with certainty because I do not read C myself.

The current setup is:
optical drive ---> USB3 ---> ESXi8 host ---> pass-through to VM ---> 
Debian 12 with kernel 6.1.0 and self-compiled kernel 6.7.4 ---> tgt 
iSCSI target ---> SCSI pass-through ---> Ethernet/WiFi with TCP ---> 
Debian 12 Stable or Testing with open-iscsi initiator ---> programs

(I have also removed the drive from its enclosure, connected it to a 
physical PC via mini-SATA and set up tgt there: same error pattern.)

Problem description:

I search for the target via iscsiadm and log in. The optical drive is 
recognized and initialized, see dmesg:

[ 6204.436754] scsi host0: iSCSI Initiator over TCP/IP
[ 6204.452976] scsi 0:0:0:0: RAID              IET      Controller       0001 PQ: 0 ANSI: 5
[ 6204.483159] scsi 0:0:0:0: Attached scsi generic sg0 type 12
[ 6204.486988] scsi 0:0:0:1: CD-ROM            NECVMWar VMware SATA CD00 1.00 PQ: 0 ANSI: 5
[ 6204.534459] sr 0:0:0:1: [sr0] scsi-1 drive
[ 6204.598759] sr 0:0:0:1: Attached scsi CD-ROM sr0
[ 6204.598905] sr 0:0:0:1: Attached scsi generic sg1 type 5
[ 6208.550055] sr 0:0:0:1: [sr0] tag#96 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 6208.550064] sr 0:0:0:1: [sr0] tag#96 Sense Key : Hardware Error [current]
[ 6208.550067] sr 0:0:0:1: [sr0] tag#96 Add. Sense: Internal target failure
[ 6208.550071] sr 0:0:0:1: [sr0] tag#96 CDB: Read(10) 28 00 00 00 00 82 00 00 7e 00
[ 6208.550073] critical target error, dev sr0, sector 520 op 0x0:(READ) flags 0x80700 phys_seg 63 prio class 2
[ 6208.763308] sr 0:0:0:1: [sr0] tag#96 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 6208.763316] sr 0:0:0:1: [sr0] tag#96 Sense Key : Hardware Error [current]
[ 6208.763319] sr 0:0:0:1: [sr0] tag#96 Add. Sense: Internal target failure
[ 6208.763322] sr 0:0:0:1: [sr0] tag#96 CDB: Read(10) 28 00 01 65 d3 00 00 00 80 00
[ 6208.763324] critical target error, dev sr0, sector 93801472 op 0x0:(READ) flags 0x80700 phys_seg 64 prio class 2
[ 6212.108642] sr 0:0:0:1: [sr0] tag#98 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[ 6212.108650] sr 0:0:0:1: [sr0] tag#98 Sense Key : Hardware Error [current]
[ 6212.108653] sr 0:0:0:1: [sr0] tag#98 Add. Sense: Internal target failure
[ 6212.108656] sr 0:0:0:1: [sr0] tag#98 CDB: Read(10) 28 00 01 65 d3 80 00 00 80 00
[ 6212.108658] critical target error, dev sr0, sector 93801984 op 0x0:(READ) flags 0x80700 phys_seg 64 prio class 2
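
For readers who want to see which reads fail, the Read(10) CDBs in the 
log can be decoded by hand. The following is only a sketch: the function 
name decode_read10 is made up for illustration, and the 2048-byte 
logical block size is an assumption (typical for data discs, not stated 
in the log). The byte layout (opcode 0x28, big-endian LBA in bytes 2-5, 
big-endian transfer length in bytes 7-8) is the standard READ(10) 
format.

```shell
#!/bin/bash
# Decode a READ(10) CDB given as ten space-separated hex bytes,
# in the form printed by the kernel after "CDB: Read(10)".
decode_read10() {
    local -a b
    read -r -a b <<< "$1"
    if [ "${#b[@]}" -ne 10 ] || [ "${b[0]}" != "28" ]; then
        echo "not a READ(10) CDB" >&2
        return 1
    fi
    # Bytes 2-5: logical block address, big-endian.
    local lba=$(( 16#${b[2]}${b[3]}${b[4]}${b[5]} ))
    # Bytes 7-8: transfer length in logical blocks, big-endian.
    local len=$(( 16#${b[7]}${b[8]} ))
    # Assumption: 2048-byte logical blocks, i.e. four 512-byte
    # block-layer sectors per logical block.
    echo "lba=$lba blocks=$len bytes=$(( len * 2048 )) sector512=$(( lba * 4 ))"
}

decode_read10 "28 00 00 00 00 82 00 00 7e 00"
decode_read10 "28 00 01 65 d3 00 00 00 80 00"
```

Under that assumption the first CDB decodes to LBA 130 with 126 blocks 
(252 KiB) and the second to LBA 23450368 with 128 blocks (256 KiB). The 
computed 512-byte sector numbers, 520 and 93801472, match the "sector" 
values in the corresponding "critical target error" lines.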

At first the drive appears to be recognized and initialized without any 
problems, but about 3-5 seconds after connecting, the first error 
messages appear.
For example, when I run rsync -av -P on the mounted drive, dmesg is 
flooded with this kind of error message and the transfer rate is 2.0 - 
2.5 MB/s. This behavior does NOT occur when I use dd 
if=/dev/sr0 of=/test.img to read the disc as a raw image. To my 
surprise, the disc is then read at the full speed of the drive (between 
20.0 and 24.0 MB/s).

That is essentially the complete error pattern with open-iscsi as the 
initiator.

I have run a number of tests to find a solution at the application 
level, so far without success.
Among other things, I have cross-checked with Windows (10/11/Server 
2022). The initiator used there establishes a connection with default 
settings and mounts the drive into the system completely error-free, 
and programs there can read the disc at full speed.

I have also tested another Debian-based distribution (virtualized), 
Ubuntu 22.04, which shows the same error. I have also tested various 
kernel versions, namely 6.7.4 (self-compiled), 6.6.13, 6.5.0, 
4.19.306 and 4.0.0 (Debian Stretch Alpha).
The error occurs in all of these kernel versions, and in the first four 
it is exactly the same. With kernel 4.0.0 the drive is apparently 
initialized without errors in dmesg; dmesg is flooded only once I start 
a read operation (e.g. with rsync).
I also read the disc with dd on the VM on which tgt is running and 
wrote the 1:1 image to a hard disk. I then entered that hard disk in 
targets.conf using SCSI passthrough and attached it on the client with 
the open-iscsi initiator.
I then used various programs on the client to read this image and copy 
its contents at file level. This worked without any problems, with no 
errors in dmesg and no loss of speed, which to me means that the error 
is not caused by open-iscsi in general.
On Friday I bought a brand new optical drive, which also works fine on 
Windows, to rule out the unlikely case of a hardware defect.

IMPORTANT NOTE: Both drives work flawlessly when I connect them to my 
Linux clients via USB(2/3) OR pass them through to a guest VM via ESXi8.

How can the error be reproduced?

Set up tgt, connect a DVD/Blu-ray-capable drive via USB/SATA.
Enter it in /etc/tgt/conf.d/targets.conf:

default-driver iscsi

<target iqn.1993-08.org.brd-srv:vbrd.target1>
  <backing-store "/dev/sg1">
      device-type pt
      bs-type sg
  </backing-store>
</target>

Restart tgt: systemctl restart tgt

On the client:

iscsiadm --mode discovery --portal target_ip --type sendtargets
iscsiadm -m node --targetname=targetname --login

The configuration file /etc/iscsi/nodes/target-iqn/ip-address/default 
can be left unchanged; changing various parameters did not affect the 
error.

Mount the disc in the drive:
mount /dev/sr0 /cdrom/

Start a copy process such as rsync -av -P /cdrom /home/user/disc/
and run dmesg -w at the same time.
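
To compare runs (for example rsync versus dd), it may help to count the 
error lines rather than watch dmesg -w by eye. This is a minimal sketch, 
not part of my original tests: the helper name count_sr_errors is made 
up, and it only matches the "critical target error, dev sr0," message 
text shown in the dmesg excerpt above.

```shell
#!/bin/bash
# Hypothetical helper: count "critical target error" lines for one
# device in kernel log text read from stdin.
count_sr_errors() {
    local dev="${1:-sr0}"
    # grep -c prints the number of matching lines; it exits non-zero
    # when there are none, so "|| true" keeps the function usable
    # under "set -e".
    grep -c "critical target error, dev ${dev}," || true
}

# Self-contained demonstration on two lines taken from the report;
# only the first one is a "critical target error" line.
sample='[ 6208.550073] critical target error, dev sr0, sector 520 op 0x0:(READ) flags 0x80700 phys_seg 63 prio class 2
[ 6208.763308] sr 0:0:0:1: [sr0] tag#96 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s'
printf '%s\n' "$sample" | count_sr_errors sr0
```

On the client, running dmesg | count_sr_errors sr0 before and after a 
copy gives the number of failed reads that the copy produced.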

I hope this report is enough to get you started. Please contact me if 
you need more information.

Best regards

[3.] Keywords (i.e., modules, networking, kernel):

Networking, SCSI, iSCSI, Kernel, cdrom, sr, tgt

[4.] Kernel information
[4.1.] Kernel version (from /proc/version):
[4.2.] Kernel .config file:
[5.] Most recent kernel version which did not have the bug:
[6.] Output of Oops.. message (if applicable) with symbolic information
     resolved (see Documentation/admin-guide/oops-tracing.rst)
[7.] A small shell script or example program which triggers the
     problem (if possible)
[8.] Environment
[8.1.] Software (add the output of the ver_linux script here)
[8.2.] Processor information (from /proc/cpuinfo):
[8.3.] Module information (from /proc/modules):
[8.4.] Loaded driver and hardware information (/proc/ioports, 
/proc/iomem)
[8.5.] PCI information ('lspci -vvv' as root)
[8.6.] SCSI information (from /proc/scsi/scsi)
[8.7.] Other information that might be relevant to the problem
       (please look in /proc and include all information that you
       think to be relevant):