Hi,

On Sat, Feb 10, 2024 at 04:18:31AM +0300, Vitaly Chikunov wrote:
>
> We started to get timeouts and controller resets since 5.19.5 (vanilla
> v5.19 is not tested, tests below are on 6.6.15) when several ioctl
> FALLOC_FL_ZERO_RANGE are issued to the device consecutively without
> delay between them (3-5 is enough to trigger the condition). Because of
> this, for example, mkfs.ext4 slows down extremely when initializing
> a filesystem. This happens on an aarch64 (Kunpeng-920) server.

I have been told that a bisect identified this commit as the cause of the
above-mentioned problem:

  commit c92a6b5d63359dd6d2ce6ea88ecd8e31dd769f6b
  Author:     Martin K. Petersen <martin.petersen@xxxxxxxxxx>
  AuthorDate: Wed Mar 2 00:35:47 2022 -0500

      scsi: core: Query VPD size before getting full page

When this commit is reverted on v5.19, the problem disappears.

Thanks,

>
> Reproducer:
>
>   # for ((i=0;i<5;i++)); do echo $i; fallocate -z -l 2097152 /dev/sdc; done
>
> Example of dmesg messages after problematic ioctl calls:
>
>   Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 Abort request is for SMID: 4753
>   Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: attempting task abort! scmd(0x00000000d51beacc) tm_dev_handle 0x4
>   Feb 06 19:44:07 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
>   Feb 06 19:44:07 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
>   Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 task abort FAILED!! scmd(0x00000000d51beacc)
>   Feb 06 19:44:07 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 CDB: Write(10) 2a 00 00 00 00 00 00 00 08 00
>   Feb 06 19:45:04 host-226 kernel: sd 0:2:4:0: [sdc] tag#8292 Abort request is for SMID: 8293
>   Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: attempting task abort! scmd(0x00000000d9406c9c) tm_dev_handle 0x4
>   Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 BRCM Debug mfi stat 0x2d, data len requested/completed 0x1000/0x0
>   Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#8292 task abort SUCCESS!! scmd(0x00000000d9406c9c)
>   Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#8292 CDB: Write Same(10) 41 00 03 4c 00 10 00 10 00 00
>   Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: attempting target reset! scmd(0x00000000d51beacc) tm_dev_handle: 0x4
>   Feb 06 19:45:06 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
>   Feb 06 19:45:06 host-226 kernel: megaraid_sas 0000:01:00.0: megasas_enable_intr_fusion is called outbound_intr_mask:0x40000000
>   Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: [sdc] tag#4752 target reset SUCCESS!!
>   Feb 06 19:45:06 host-226 kernel: sd 0:2:4:0: Power-on or device reset occurred
>
> Excerpt from the controller events log (from storcli):
>
>   Event Description: PD 05(e0xfb/s4) Path 5e8b4700e35e2004 reset (Type 03)
>   Event Description: Drive PD 05(e0xfb/s4) link speed changed
>   Event Description: Unexpected sense: Encl PD fb Path 5e8b4700e35e201e, CDB: 3c 01 05 00 00 00 00 00 10 00, Sense: b/4b/05
>   Event Description: Unexpected sense: Encl PD fb Path 5e8b4700e35e201e, CDB: 3c 01 05 00 00 00 00 00 10 00, Sense: b/4b/05
>   Event Description: PD 05(e0xfb/s4) Path 5e8b4700e35e2004 reset (Type 03)
>   Event Description: Drive PD 05(e0xfb/s4) link speed changed
>   Event Description: Unexpected sense: PD 05(e0xfb/s4) Path 5e8b4700e35e2004, CDB: 41 00 00 00 00 00 00 10 00 00, Sense: 6/29/00
>
> Tests were on the latest firmware (at the moment):
>
>   Product Name = MegaRAID 9560-8i 4GB
>   Serial Number = SKC4006982
>   Firmware Package Build = 52.28.0-5305
>   Firmware Version = 5.280.02-3972
>   PSOC FW Version = 0x001A
>   PSOC Hardware Version = 0x000A
>   PSOC Part Number = 29211-260-4GB
>   NVDATA Version = 5.2800.00-0752
>   CBB Version = 28.250.04.00
>   Bios Version = 7.28.00.0_0x071C0000
>   HII Version = 07.28.04.00
>   HIIA Version = 07.28.04.00
>   Driver Name = megaraid_sas
>   Driver Version = 07.725.01.00-rc1
>
> I also tried the latest available megaraid_sas driver (07.728.04.00), which is
> not yet merged into mainline, but the problems are not resolved with it.
>
> Thanks,
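
For anyone reading the CDB dumps in the logs above, here is a minimal Python sketch that decodes the 10-byte Write(10) and Write Same(10) CDBs. The helper name decode_cdb10 and the 512-byte logical block size are assumptions made for illustration (the block size of sdc is not stated in this thread). The field layout is the usual 10-byte CDB form: opcode, flags, 4-byte LBA, group number, 2-byte length, control.

```python
import struct

# Opcodes that appear in the dmesg excerpt; other opcodes are reported as unknown.
OPCODES = {0x2A: "WRITE(10)", 0x41: "WRITE SAME(10)"}

# Assumption: 512-byte logical blocks. Adjust if the device reports 4096.
ASSUMED_BLOCK_SIZE = 512


def decode_cdb10(cdb_hex, block_size=ASSUMED_BLOCK_SIZE):
    """Decode a 10-byte CDB given as space-separated hex, as printed by the kernel."""
    raw = bytes.fromhex(cdb_hex)
    if len(raw) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(raw))
    # Big-endian: opcode, flags, 32-bit LBA, group number, 16-bit length, control.
    opcode, flags, lba, group, nblocks, control = struct.unpack(">BBIBHB", raw)
    return {
        "name": OPCODES.get(opcode, "unknown (0x%02x)" % opcode),
        "lba": lba,
        "blocks": nblocks,
        "bytes": nblocks * block_size,
    }


if __name__ == "__main__":
    for cdb in ("2a 00 00 00 00 00 00 00 08 00",
                "41 00 03 4c 00 10 00 10 00 00"):
        print(cdb, "->", decode_cdb10(cdb))
```

Under that block-size assumption, the Write Same(10) length field of 0x1000 blocks works out to 2097152 bytes, the same length the reproducer passes to fallocate, and the Write(10) length of 8 blocks works out to 0x1000 bytes, which matches the "data len requested" value in the BRCM debug line.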