https://bugzilla.kernel.org/show_bug.cgi?id=199435

--- Comment #17 from Don (don.brace@xxxxxxxxxxxxx) ---
(In reply to Anthony Hausman from comment #16)
> Don,
>
> So I'm actually running the kernel 4.16.3 (build 18-04-19) with the hpsa
> modules patch to use local work-queue instead of system work-queue.
>
> I have reproduced a reset with no stack trace (which is good news).
> The only thing is that between the resetting logical and the completion, 2
> hours passed and caused a heavy load on the server during this time:
>
> Apr 25 01:31:09 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: resetting logical
> Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1
> Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: device is ready.
> Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: reset logical
> completed successfully Direct-Access HP LOGICAL VOLUME RAID-0
> SSDSmartPathCap- En- Exp=1
>
> The good thing is that after the reset has completed, this one is removed:
>
> Apr 25 03:31:45 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: removed
> Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1

The driver was notified by the P420i that the volume went offline, so the
driver removed it from the SCSI mid-layer (SML).

> Apr 25 03:31:48 kernel: scsi 0:1:0:0: rejecting I/O to dead device

There were I/O requests for the device, but the SML detected that it had been
deleted.

>
> So the question is whether it's normal that the reset logical takes such a
> long time (and causes trouble on the server)?

It is not normal. For a Logical Volume reset, the P420i flushes out any
outstanding I/O requests and then returns. The SML should block any new
requests from coming down while the reset is in progress.

Do you know what process was consuming the CPU cycles?

    ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20

Are you using sg_reset to test LV resets? Or does the device have
intermittent issues that are causing the SML to issue the reset operation?

If you turn off the agents, do the resets complete more quickly? I am wondering if the agents are frequently probing the P420i for changes when the reset is active and the agents are consuming the CPU cycles.
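
To compare reset times with and without the agents running, it may help to measure the reset duration directly from the kernel log rather than by eye. The following is a minimal sketch, not part of the hpsa driver or any HP tooling. The helper names (reset_durations, parse_ts) are hypothetical. It assumes unwrapped log lines in the format quoted above, an English locale for month names, and a year supplied by the caller, since syslog timestamps carry none (2018 is assumed here).

```python
import re
from datetime import datetime, timedelta

# Syslog prefix as seen in the quoted log: "Apr 25 01:31:09 kernel: <message>".
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) kernel: (?P<msg>.*)$"
)


def parse_ts(ts, year):
    # Syslog timestamps have no year, so the caller must supply one (assumption).
    return datetime.strptime(f"{year} {ts}", "%Y %b %d %H:%M:%S")


def reset_durations(lines, year=2018):
    """Return (start, end, duration) for each logical reset found in lines."""
    results = []
    start = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not look like kernel log entries
        when = parse_ts(match.group("ts"), year)
        msg = match.group("msg")
        if "resetting logical" in msg:
            start = when
        elif "reset logical completed successfully" in msg and start is not None:
            results.append((start, when, when - start))
            start = None
    return results


# The log lines from the report above, unwrapped to one entry per line.
SAMPLE = [
    "Apr 25 01:31:09 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: resetting logical "
    "Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1",
    "Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: device is ready.",
    "Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: reset logical "
    "completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 "
    "SSDSmartPathCap- En- Exp=1",
]

if __name__ == "__main__":
    for begin, end, took in reset_durations(SAMPLE):
        print(f"reset started {begin}, completed {end}, took {took}")
```

Run against the quoted log, this reports a duration of 1:59:51. Running it over the full log for each test (agents on, agents off) would give comparable numbers for every reset.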