On 2024/08/07 10:23, Christian Heusel wrote:
> Hello Igor, hello Niklas,
>
> on my NAS I am encountering the following issue since v6.6.44 (LTS),
> when executing the hdparm command for my WD-WCC7K4NLX884 drives to get
> the active or standby state:
>
>     $ hdparm -C /dev/sda
>     /dev/sda:
>     SG_IO: bad/missing sense data, sb[]:  f0 00 01 00 50 40 ff 0a 00 00 78 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>      drive state is:  unknown
>
>
> While the expected output is the following:
>
>     $ hdparm -C /dev/sda
>     /dev/sda:
>      drive state is:  active/idle
>
> I did a bisection within the stable series and found the following
> commit to be the first bad one:
>
>     28ab9769117c ("ata: libata-scsi: Honor the D_SENSE bit for CK_COND=1 and no error")
>
> According to kernel.dance the same commit was also backported to the
> v6.10.3 and v6.1.103 stable kernels and I could not find any commit or
> pending patch with a "Fixes:" tag for the offending commit.
>
> So far I have not been able to test with the mainline kernel as this is
> a remote device which I couldn't rescue in case of a boot failure. Also
> just for transparency it does have the out of tree ZFS module loaded,
> but AFAIU this shouldn't be an issue here, as the commit seems clearly
> related to the error. If needed I can test with an untainted mainline
> kernel on Friday when I'm near the device.
>
> I have attached the output of hdparm -I below and would be happy to
> provide further debug information or test patches.

I confirm this, using 6.11-rc2. The problem is actually hdparm code which
assumes that the sense data is in descriptor format without ever looking at the
D_SENSE bit to verify that. So commit 28ab9769117c reveals this issue because as
its title explains, it (correctly) honors D_SENSE instead of always generating
sense data in descriptor format.

Hmm... This is annoying. The kernel is fixed to be spec compliant but that
breaks old/non-compliant applications...
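
To illustrate the format dependence described above, here is a minimal sketch
(in Python for brevity; this is not hdparm's code) of a parser that checks the
sense data response code before reading the ATA registers. The names
parse_ata_sense, AtaReturn and REPORTED_SENSE are hypothetical. The byte
offsets are assumptions based on my reading of the SAT layouts for the ATA
Status Return descriptor (type 0x09) and for fixed format sense data with
ASC/ASCQ 00h/1Dh, so they should be checked against the spec before reuse.

```python
from typing import NamedTuple, Optional


class AtaReturn(NamedTuple):
    """ATA registers recovered from pass-through sense data (hypothetical type)."""
    fmt: str      # "fixed" or "descriptor"
    error: int
    status: int
    device: int
    count: int    # COUNT(7:0); CHECK POWER MODE reports the power state here


def parse_ata_sense(sb: bytes) -> Optional[AtaReturn]:
    """Extract ATA registers from sense data, handling both sense formats."""
    if len(sb) < 8:
        return None
    code = sb[0] & 0x7F  # drop the VALID bit (0x80)

    if code in (0x72, 0x73):
        # Descriptor format: byte 7 is the additional length,
        # descriptors start at byte 8.
        end = min(len(sb), 8 + sb[7])
        pos = 8
        while pos + 2 <= end:
            dtype, dlen = sb[pos], sb[pos + 1]
            # Assumed layout: type 0x09, length 0x0c, error at +3,
            # count(7:0) at +5, device at +12, status at +13.
            if dtype == 0x09 and dlen >= 0x0C and pos + 14 <= len(sb):
                d = sb[pos:pos + 14]
                return AtaReturn("descriptor", d[3], d[13], d[12], d[5])
            pos += 2 + dlen
        return None

    if code in (0x70, 0x71):
        # Fixed format: only meaningful here when ASC/ASCQ is 00h/1Dh
        # (ATA PASS-THROUGH INFORMATION AVAILABLE).
        if len(sb) < 14 or sb[12] != 0x00 or sb[13] != 0x1D:
            return None
        # Assumed layout: error, status, device, count(7:0) in bytes 3..6.
        return AtaReturn("fixed", sb[3], sb[4], sb[5], sb[6])

    return None


# The sense buffer printed by hdparm in the report above.
REPORTED_SENSE = bytes.fromhex(
    "f0 00 01 00 50 40 ff 0a 00 00 78 00 00 1d" + " 00" * 18
)
```

Under these assumptions, parse_ata_sense(REPORTED_SENSE) takes the fixed format
branch (response code 0x70 after masking the VALID bit) and yields a count of
0xff, which is the CHECK POWER MODE value for active/idle. So the buffer in the
report appears to carry the expected drive state, just not in the format hdparm
looks for.
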
We definitely should fix hdparm code, but I think we still need to revert
28ab9769117c... Niklas, Igor, thoughts ?

>
> Cheers,
> Christian
>
> ---
>
> #regzbot introduced: 28ab9769117c
> #regzbot title: ata: libata-scsi: Sense data errors breaking hdparm with WD drives
>
> ---
>
> $ pacman -Q hdparm
> hdparm 9.65-2
>
> $ hdparm -I /dev/sda
>
> /dev/sda:
>
> ATA device, with non-removable media
>     Model Number:       WDC WD40EFRX-68N32N0
>     Serial Number:      WD-WCC7K4NLX884
>     Firmware Revision:  82.00A82
>     Transport:          Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
> Standards:
>     Used: unknown (minor revision code 0x006d)
>     Supported: 10 9 8 7 6 5
>     Likely used: 10
> Configuration:
>     Logical         max     current
>     cylinders       16383   0
>     heads           16      0
>     sectors/track   63      0
>     --
>     LBA    user addressable sectors:   268435455
>     LBA48  user addressable sectors:  7814037168
>     Logical  Sector size:                   512 bytes
>     Physical Sector size:                  4096 bytes
>     Logical Sector-0 offset:                  0 bytes
>     device size with M = 1024*1024:     3815447 MBytes
>     device size with M = 1000*1000:     4000787 MBytes (4000 GB)
>     cache/buffer size  = unknown
>     Form Factor: 3.5 inch
>     Nominal Media Rotation Rate: 5400
> Capabilities:
>     LBA, IORDY(can be disabled)
>     Queue depth: 32
>     Standby timer values: spec'd by Standard, with device specific minimum
>     R/W multiple sector transfer: Max = 16  Current = 16
>     DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
>          Cycle time: min=120ns recommended=120ns
>     PIO: pio0 pio1 pio2 pio3 pio4
>          Cycle time: no flow control=120ns  IORDY flow control=120ns
> Commands/features:
>     Enabled Supported:
>        *    SMART feature set
>             Security Mode feature set
>        *    Power Management feature set
>        *    Write cache
>        *    Look-ahead
>        *    Host Protected Area feature set
>        *    WRITE_BUFFER command
>        *    READ_BUFFER command
>        *    NOP cmd
>        *    DOWNLOAD_MICROCODE
>             Power-Up In Standby feature set
>        *    SET_FEATURES required to spinup after power up
>             SET_MAX security extension
>        *    48-bit Address feature set
>        *    Device Configuration Overlay feature set
>        *    Mandatory FLUSH_CACHE
>        *    FLUSH_CACHE_EXT
>        *    SMART error logging
>        *    SMART self-test
>        *    General Purpose Logging feature set
>        *    64-bit World wide name
>        *    IDLE_IMMEDIATE with UNLOAD
>        *    WRITE_UNCORRECTABLE_EXT command
>        *    {READ,WRITE}_DMA_EXT_GPL commands
>        *    Segmented DOWNLOAD_MICROCODE
>        *    Gen1 signaling speed (1.5Gb/s)
>        *    Gen2 signaling speed (3.0Gb/s)
>        *    Gen3 signaling speed (6.0Gb/s)
>        *    Native Command Queueing (NCQ)
>        *    Host-initiated interface power management
>        *    Phy event counters
>        *    Idle-Unload when NCQ is active
>        *    NCQ priority information
>        *    READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
>        *    DMA Setup Auto-Activate optimization
>        *    Device-initiated interface power management
>        *    Software settings preservation
>        *    SMART Command Transport (SCT) feature set
>        *    SCT Write Same (AC2)
>        *    SCT Error Recovery Control (AC3)
>        *    SCT Features Control (AC4)
>        *    SCT Data Tables (AC5)
>             unknown 206[12] (vendor specific)
>             unknown 206[13] (vendor specific)
>        *    DOWNLOAD MICROCODE DMA command
>        *    WRITE BUFFER DMA command
>        *    READ BUFFER DMA command
> Security:
>     Master password revision code = 65534
>             supported
>     not     enabled
>     not     locked
>             frozen
>     not     expired: security count
>             supported: enhanced erase
>     504min for SECURITY ERASE UNIT. 504min for ENHANCED SECURITY ERASE UNIT.
> Logical Unit WWN Device Identifier: 50014ee2647735a1
>     NAA             : 5
>     IEEE OUI        : 0014ee
>     Unique ID       : 2647735a1
> Checksum: correct

-- 
Damien Le Moal
Western Digital Research