For a long time I have noticed that I often see disk errors at boot time, but afterwards all seems well.

Below is an example of the relevant log messages after a boot. Initially things are normal for all seven disks in the array, then there is a burst of messages for sdb, including two resets. I marked the sdb messages with <<<<<. It is as if this one disk takes longer to come up.

I see this on three disks but not on the other four (all are the same model, Seagate ST12000NM0007 [Yes, I know]).

I wonder whether this can be related to the controller (LSISAS2008) or maybe the cabling. The controller has two sockets, each taking a bundle of four cables. Three of the four disks on one bundle show the problem, and none of the three disks on the second bundle do.

Then again, it may indicate a disk issue. Is an RMA due? I regularly run an "Extended offline" test and it always succeeds. Or maybe some timeout is too short (can I set it?).

Following such an incident, smartctl reports an increase in Command_Timeout and UDMA_CRC_Error_Count.
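On the timeout question: as far as I know, the per-device SCSI command timeout is exposed in sysfs as /sys/block/<disk>/device/timeout (in seconds), and root can write a new value there until the next reboot. Below is a minimal Python sketch for reading and setting it. The function names and the sysfs_root parameter are my own, added only so the code can be tried against a scratch directory. Note that the failed commands in the log show cmd_age=0s, so this knob may well not be the relevant one.

```python
from pathlib import Path


def scsi_timeout_path(disk: str, sysfs_root: str = "/sys") -> Path:
    """Path of the per-device SCSI command timeout file for e.g. 'sdb'."""
    return Path(sysfs_root) / "block" / disk / "device" / "timeout"


def read_scsi_timeout(disk: str, sysfs_root: str = "/sys") -> int:
    """Return the current command timeout in seconds."""
    return int(scsi_timeout_path(disk, sysfs_root).read_text().strip())


def write_scsi_timeout(disk: str, seconds: int, sysfs_root: str = "/sys") -> None:
    """Set the command timeout (needs root on a real system; not persistent)."""
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    scsi_timeout_path(disk, sysfs_root).write_text(f"{seconds}\n")


if __name__ == "__main__":
    # Report the timeout of every sd* disk that exposes one.
    for dev in sorted(Path("/sys/block").glob("sd*")):
        try:
            print(dev.name, read_scsi_timeout(dev.name), "s")
        except (OSError, ValueError):
            print(dev.name, "timeout not readable")
```

Calling write_scsi_timeout("sdb", 60) as root would raise the timeout for sdb only; the other disks keep their current value.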
TIA
	Eyal

================ log start ==============
2023-05-05T17:15:44+1000 kernel: Linux version 6.2.14-100.fc36.x86_64 (mockbuild@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx) (gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4), GNU ld version 2.37-37.fc36) #1 SMP PREEMPT_DYNAMIC Mon May 1 00:54:35 UTC 2023
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: 64 BIT PCI BUS DMA ADDRESSING SUPPORTED, total mem (32705204 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: MSI-X vectors supported: 1
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: 0 1 1
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: High IOPs queues : disabled
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: iomem(0x00000000514c0000), mapped(0x00000000d8efeca3), size(16384)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: ioport(0x0000000000004000), size(256)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: CurrentHostPageSize is 0: Setting default host page size to 4k
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: scatter gather: sge_in_main_msg(1), sge_per_chain(9), sge_per_io(128), chains_per_io(15)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: request pool(0x000000003049b737) - dma(0x111800000): depth(3492), frame_size(128), pool_size(436 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: sense pool(0x000000008e6843eb) - dma(0x111f00000): depth(3367), element_size(96), pool_size (315 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: reply pool(0x00000000acd81aaa) - dma(0x111f80000): depth(3556), frame_size(128), pool_size(444 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: config page(0x00000000c56162d9) - dma(0x111eb5000): size(512)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Allocated physical memory: size(7579 kB)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Current Controller Queue Depth(3364),Max Controller Queue Depth(3432)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Scatter Gather Elements per IO(128)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: Protocol=(Initiator,Target
2023-05-05T17:15:45+1000 kernel: mpt2sas_cm0: sending port enable !!
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: hba_port entry: 00000000e9b01ff1, port: 255 is added to hba_port list
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: host_add: handle(0x0001), sas_addr(0x500605b0013ca580), phys(8)
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: handle(0x9) sas_address(0x4433221100000000) port_type(0x1)
2023-05-05T17:15:47+1000 kernel: mpt2sas_cm0: handle(0xa) sas_address(0x4433221101000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xb) sas_address(0x4433221102000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xc) sas_address(0x4433221103000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xd) sas_address(0x4433221105000000) port_type(0x1)
2023-05-05T17:15:48+1000 kernel: mpt2sas_cm0: handle(0xe) sas_address(0x4433221106000000) port_type(0x1)
2023-05-05T17:15:49+1000 kernel: mpt2sas_cm0: handle(0xf) sas_address(0x4433221107000000) port_type(0x1)
2023-05-05T17:15:53+1000 kernel: mpt2sas_cm0: port enable: SUCCESS
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: Attached scsi generic sg2 type 0 <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: Power-on or device reset occurred <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: Attached scsi generic sg3 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: Attached scsi generic sg4 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: Attached scsi generic sg5 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB) <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] 4096-byte physical blocks <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: Attached scsi generic sg6 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: Attached scsi generic sg7 type 0
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: Power-on or device reset occurred
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Write cache: enabled, read cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Write cache: enabled, read cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] 23437770752 512-byte logical blocks: (12.0 TB/10.9 TiB)
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] 4096-byte physical blocks
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Write Protect is off
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Write cache: enabled, read cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Write cache: enabled, read cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Write cache: enabled, read cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Mode Sense: 7f 00 10 08
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Write cache: enabled, read cache: enabled, supports DPO and FUA
2023-05-05T17:15:53+1000 kernel: sdd: sdd1
2023-05-05T17:15:53+1000 kernel: sdh: sdh1
2023-05-05T17:15:53+1000 kernel: sd 6:0:2:0: [sdd] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: sdg: sdg1
2023-05-05T17:15:53+1000 kernel: sd 6:0:5:0: [sdg] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: sd 6:0:6:0: [sdh] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: sdc: sdc1
2023-05-05T17:15:53+1000 kernel: sd 6:0:1:0: [sdc] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: sdf: sdf1
2023-05-05T17:15:53+1000 kernel: sd 6:0:4:0: [sdf] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: sde: sde1
2023-05-05T17:15:53+1000 kernel: sd 6:0:3:0: [sde] Attached SCSI disk
2023-05-05T17:15:53+1000 kernel: mpt2sas_cm0: log_info(0x31110d01): originator(PL), code(0x11), sub_code(0x0d01) <<<<< start
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: Power-on or device reset occurred <<<<<
2023-05-05T17:15:53+1000 kernel: sdb: sdb1 <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Attached SCSI disk <<<<<
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] Unaligned partial completion (resid=1020, sector_sz=512)
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 CDB: Read(16) 88 00 00 00 00 05 74 ff ff 80 00 00 00 08 00 00
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 Sense Key : Aborted Command [current]
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 Add. Sense: Information unit iuCRC error detected
2023-05-05T17:15:53+1000 kernel: sd 6:0:0:0: [sdb] tag#33 CDB: Read(16) 88 00 00 00 00 05 74 ff ff 80 00 00 00 08 00 00
2023-05-05T17:15:53+1000 kernel: I/O error, dev sdb, sector 23437770624 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] Unaligned partial completion (resid=1020, sector_sz=512)
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 CDB: Read(16) 88 00 00 00 00 05 74 ff fe 70 00 00 00 08 00 00
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 Sense Key : Aborted Command [current]
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 Add. Sense: Information unit iuCRC error detected
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#42 CDB: Read(16) 88 00 00 00 00 05 74 ff fe 70 00 00 00 08 00 00
2023-05-05T17:15:54+1000 kernel: I/O error, dev sdb, sector 23437770352 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
2023-05-05T17:15:54+1000 kernel: mpt2sas_cm0: log_info(0x31110d01): originator(PL), code(0x11), sub_code(0x0d01)
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#51 FAILED Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK cmd_age=0s
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: [sdb] tag#51 CDB: Read(16) 88 00 00 00 00 05 74 ff f3 f0 00 00 00 08 00 00
2023-05-05T17:15:54+1000 kernel: I/O error, dev sdb, sector 23437767664 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 2
2023-05-05T17:15:54+1000 kernel: sd 6:0:0:0: Power-on or device reset occurred <<<<< end
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdh1 operational as raid disk 6
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdf1 operational as raid disk 4
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdb1 operational as raid disk 0
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdd1 operational as raid disk 2
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdc1 operational as raid disk 1
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sdg1 operational as raid disk 5
2023-05-05T17:16:01+1000 kernel: md/raid:md127: device sde1 operational as raid disk 3
2023-05-05T17:16:01+1000 kernel: md/raid:md127: raid level 6 active with 7 out of 7 devices, algorithm 2
2023-05-05T17:16:01+1000 kernel: md127: detected capacity change from 0 to 117187522560
2023-05-05T17:16:03+1000 kernel: EXT4-fs (md127): mounted filesystem 378e74a6-e379-4bd5-ade5-f3cd85952099 with ordered data mode. Quota mode: none.
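
The failing sectors can be cross-checked against the CDBs in the log. Here is a minimal Python sketch, assuming the standard READ(16) layout: opcode 0x88, an 8-byte big-endian LBA in bytes 2 to 9, and a 4-byte transfer length in bytes 10 to 13. The name decode_read16 is my own and does not come from any tool.

```python
from typing import Tuple


def decode_read16(cdb_hex: str) -> Tuple[int, int]:
    """Return (lba, block_count) from a READ(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")     # bytes 2..9: logical block address
    count = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length
    return lba, count


# From the "[sdb] 23437770752 512-byte logical blocks" line in the log.
TOTAL_BLOCKS = 23437770752

# The three failed reads, copied from the log.
FAILED_CDBS = [
    "88 00 00 00 00 05 74 ff ff 80 00 00 00 08 00 00",  # tag#33
    "88 00 00 00 00 05 74 ff fe 70 00 00 00 08 00 00",  # tag#42
    "88 00 00 00 00 05 74 ff f3 f0 00 00 00 08 00 00",  # tag#51
]

for cdb in FAILED_CDBS:
    lba, count = decode_read16(cdb)
    print(f"lba={lba} blocks={count} distance_from_end={TOTAL_BLOCKS - lba}")
```

The decoded LBAs match the sector numbers in the "I/O error" lines, and all three reads are 8 blocks long and start 128, 400 and 3088 sectors before the end of the disk.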
--
Eyal Lebedinsky (fedora@xxxxxxxxxxxxxx)