https://bugzilla.kernel.org/show_bug.cgi?id=216964

            Bug ID: 216964
           Summary: LSI SAS1068 logical volume caching mode not detected (with patch)
           Product: SCSI Drivers
           Version: 2.5
    Kernel Version: 5.15.58
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: scsi_drivers-other@xxxxxxxxxxxxxxxxxxxx
          Reporter: michal.ruza@xxxxxxxxx
        Regression: No

Created attachment 303641
  --> https://bugzilla.kernel.org/attachment.cgi?id=303641&action=edit
Patch to use fixed size buffer for the buggy MODE SENSE command

Hardware:
HBA card: Broadcom / LSI SAS1068 PCI-X Fusion-MPT SAS
PCI VID:PID: 1000:0054

Problem:
Caching mode of logical volumes managed by the controller in question is not
detected. Relevant kernel messages:
[ 19.642388] scsi 8:1:0:0: Direct-Access LSILOGIC Logical Volume 3000 PQ: 0 ANSI: 2
[ 19.649179] sd 8:1:0:0: Attached scsi generic sg6 type 0
[ 19.649390] sd 8:1:0:0: [sdd] 583983104 512-byte logical blocks: (299 GB/278 GiB)
[ 19.649625] sd 8:1:0:0: [sdd] Write Protect is off
[ 19.649629] sd 8:1:0:0: [sdd] Mode Sense: 03 00 00 08
[ 19.649837] sd 8:1:0:0: [sdd] No Caching mode page found
[ 19.649853] sd 8:1:0:0: [sdd] Assuming drive cache: write through
[ 19.666881] sdd: sdd1 sdd2 sdd3
[ 19.667776] sd 8:1:0:0: [sdd] Attached SCSI disk

Cause of the problem:
The SCSI MODE SENSE command is broken for the logical volumes managed by the
controller in question in that it does not set the length field in the
returned response to the length of the entire response but rather only to the
length of the portion of the response actually written to the provided buffer
(which is obviously limited by the length of the provided buffer). This breaks
the logic in sd_read_cache_type [1] which first tries to determine the size of
the entire response by executing the MODE SENSE command with a small buffer
and then uses the length field from the partial response to size the buffer
for the entire response appropriately.
This does not work for the logical volumes managed by the controller in
question, because for them the reported response length is never greater than
the length of the provided buffer (in fact it is always 3, as evidenced by the
first byte in the "Mode Sense:" log message - which is the length of the small
buffer provided to the MODE SENSE command less the length byte itself), so the
response is never received in its entirety, which leads to the caching mode
detection failure.

The problem can be demonstrated by the sg_modes command:
- invoked on the misbehaving logical volume:
# sg_modes -6 -p 8 -m 4 -d -H /dev/sdd
    LSILOGIC Logical Volume 3000 peripheral_type: disk [0x0]
 00     03 00 00 00
- invoked on a correctly behaving disk:
# sg_modes -6 -p 8 -m 4 -d -H /dev/sda
    ATA WDC WD4003FFBX-6 0A83 peripheral_type: disk [0x0]
 00     17 00 00 00
Notice the difference in the length field - the first byte of the response
(the leading 00 on each hex line is the offset column, not response data).

Nevertheless the misbehaving logical volume _does_ report the caching mode
correctly when the relevant MODE SENSE command is executed with a large enough
buffer. Again this can be demonstrated by the sg_modes command:
# sg_modes -6 -p 8 -m 192 -d -H /dev/sdd
    LSILOGIC Logical Volume 3000 peripheral_type: disk [0x0]
 00     17 00 00 00 08 12 04 00  ff ff 00 00 ff 00 ff ff
 10     00 0f 00 00 00 00 00 00

Possible fix:
It turns out there is already a flag in struct scsi_device which forces the
relevant MODE SENSE command to be executed with a 192-byte buffer (which is
long enough to hold the entire response): use_192_bytes_for_3f [2]. When this
flag is set for the misbehaving disk/logical volume (together with the
skip_ms_page_8 flag), the caching mode detection works correctly. This has
been verified by applying the attached patch.
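
For illustration only, the failure mode and the effect of a fixed-size request
can be modelled in user space. The sketch below is not the sd.c code and not
the attached patch: mode_sense(), probe_two_pass() and probe_fixed() are
hypothetical helpers, and the "device" is just the 24-byte response from the
sg_modes dump above. The buggy variant is assumed to behave as described in
this report, i.e. it sets the length byte to the number of bytes actually
transferred minus one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* MODE SENSE(6) response for page 0x08, copied from the sg_modes dump. */
static const unsigned char full_response[24] = {
	0x17, 0x00, 0x00, 0x00,                         /* mode parameter header */
	0x08, 0x12, 0x04, 0x00, 0xff, 0xff, 0x00, 0x00, /* caching mode page     */
	0xff, 0x00, 0xff, 0xff, 0x00, 0x0f, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00,
};

/*
 * Hypothetical device model: copies at most buf_len bytes of the response
 * and returns the number of bytes copied. A buggy device rewrites the length
 * byte to (bytes transferred - 1) instead of (full response length - 1).
 */
static size_t mode_sense(bool buggy, unsigned char *buf, size_t buf_len)
{
	size_t n = buf_len < sizeof(full_response) ? buf_len
						   : sizeof(full_response);

	assert(buf != NULL);
	if (n == 0)
		return 0;
	memcpy(buf, full_response, n);
	if (buggy)
		buf[0] = (unsigned char)(n - 1);
	return n;
}

/* Two-pass probe: fetch the 4-byte header, then re-read with the size it reports. */
static size_t probe_two_pass(bool buggy, unsigned char *buf, size_t buf_len)
{
	size_t got = mode_sense(buggy, buf, buf_len < 4 ? buf_len : 4);
	size_t total;

	if (got < 4)
		return got;
	total = (size_t)buf[0] + 1;	/* length byte does not count itself */
	if (total > buf_len)
		total = buf_len;
	if (total <= got)
		return got;		/* device claims nothing more to fetch */
	return mode_sense(buggy, buf, total);
}

/* Fixed-size probe: a single request using the whole buffer, no size hint. */
static size_t probe_fixed(bool buggy, unsigned char *buf, size_t buf_len)
{
	return mode_sense(buggy, buf, buf_len);
}
```

Called with a 192-byte buffer, probe_two_pass() retrieves all 24 bytes from
the well-behaved model but stops at the 4-byte header for the buggy one,
while probe_fixed() retrieves all 24 bytes in both cases.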
With the patch applied, the relevant kernel messages look like this:
[ 19.263001] scsi 8:1:0:0: Direct-Access LSILOGIC Logical Volume 3000 PQ: 0 ANSI: 2
[ 19.263190] sd 8:1:0:0: Attached scsi generic sg6 type 0
[ 19.263413] sd 8:1:0:0: [sdd] 583983104 512-byte logical blocks: (299 GB/278 GiB)
[ 19.263690] sd 8:1:0:0: [sdd] Write Protect is off
[ 19.263694] sd 8:1:0:0: [sdd] Mode Sense: 67 00 00 08
[ 19.263970] sd 8:1:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 19.279922] sdd: sdd1 sdd2 sdd3
[ 19.280904] sd 8:1:0:0: [sdd] Attached SCSI disk

[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/scsi/sd.c?h=v5.15.58#n2687
    https://elixir.bootlin.com/linux/v5.15.58/source/drivers/scsi/sd.c#L2687
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/include/scsi/scsi_device.h?h=v5.15.58#n184
    https://elixir.bootlin.com/linux/v5.15.58/source/include/scsi/scsi_device.h#L184
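
As a cross-check of the "Write cache: enabled, read cache: enabled, doesn't
support DPO or FUA" message, the bytes from the 192-byte sg_modes dump can be
decoded by hand. The sketch below is an illustration, not the parser in sd.c:
struct cache_info and parse_mode_sense6() are hypothetical names. It assumes
the usual SCSI layout of a MODE SENSE(6) response: a 4-byte header with the
DPOFUA bit at bit 4 of byte 2, followed by the block descriptors and then the
mode pages, with WCE at bit 2 and RCD at bit 0 of byte 2 of the caching page
(0x08).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cache_info {
	bool page_found;	/* caching mode page (0x08) present        */
	bool write_cache;	/* WCE bit set                             */
	bool read_cache;	/* RCD (read cache disable) bit clear      */
	bool dpofua;		/* DPOFUA bit set in the mode header       */
};

/* Decode a MODE SENSE(6) response held in buf[0..len-1]. */
static struct cache_info parse_mode_sense6(const unsigned char *buf, size_t len)
{
	struct cache_info ci = { false, false, false, false };
	size_t total, off;

	assert(buf != NULL);
	if (len < 4)
		return ci;

	total = (size_t)buf[0] + 1;	/* length byte does not count itself */
	if (total > len)
		total = len;
	ci.dpofua = (buf[2] & 0x10) != 0;
	off = 4 + (size_t)buf[3];	/* skip header and block descriptors */

	while (off + 2 <= total) {
		bool spf = (buf[off] & 0x40) != 0;	/* sub-page format */
		size_t page_len;

		if (spf) {
			if (off + 4 > total)
				break;
			page_len = 4 + (((size_t)buf[off + 2] << 8) | buf[off + 3]);
		} else {
			page_len = 2 + (size_t)buf[off + 1];
		}

		if (!spf && (buf[off] & 0x3f) == 0x08 && off + 3 <= total) {
			ci.page_found = true;
			ci.write_cache = (buf[off + 2] & 0x04) != 0;
			ci.read_cache = (buf[off + 2] & 0x01) == 0;
			return ci;
		}
		off += page_len;
	}
	return ci;
}
```

Fed the 24 bytes from the dump (header 17 00 00 00, page 08 12 04 ...), this
reports the page as found with both caches enabled and DPOFUA clear. Fed the
truncated 4-byte response 03 00 00 00, it finds no caching page, which
corresponds to the "No Caching mode page found" message in the unpatched log.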