On 23.09.2018 at 20:22, Stefan Priebe - Profihost AG wrote:
>
> On 22.09.2018 at 23:40, Bart Van Assche wrote:
>> On 9/18/18 11:10 PM, Stefan Priebe - Profihost AG wrote:
>>> after upgrading the aacraid driver / kernel from aacraid 50792 to
>>> aacraid 50877.
>>
>> The aacraid driver version was updated to 50792 in commit 0662cc968ace
>> ("scsi: aacraid: Update driver version") and to 50877 in commit
>> 1cdb74b80f93 ("scsi: aacraid: Update driver version to 50877"). That
>> means that the regression you encountered got introduced after commit
>> 0662cc968ace. 114 changes got checked in after that commit. That's too
>> much to find the root cause by rereading all these changes. Is there any
>> way to trigger the problem faster such that it becomes feasible to run a
>> bisect?
>
> Sadly, I'm not able to. Maybe something else in the kernel has changed as well.
>
> I'm now trying the original out-of-tree driver from Microsemi / Adaptec:
> Adaptec aacraid driver 1.2.1.56008src
>
> No idea how those driver versions correspond to the kernel ones.

OK, the out-of-tree driver also timed out and the whole system became unreachable.
The output just looked different:

2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Host adapter abort request (0,0,1,0)
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Timed out Command: 2a 00 c0 98 6f a2 00 00 28 00 00 00 00 00 00 00
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:FIB = ffff950ca6706780 : bac49220 Command = 502 XferState = 830ad Wait Time = 120Sec
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Host adapter abort request (0,0,1,0)
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Timed out Command: 2a 00 c0 98 6d a2 00 02 00 00 00 00 00 00 00 00
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:FIB = ffff950ca67066c8 : bac48a00 Command = 502 XferState = 830ad Wait Time = 120Sec
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Host adapter abort request (0,0,1,0)
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Timed out Command: 2a 00 c0 98 6b a2 00 02 00 00 00 00 00 00 00 00
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:FIB = ffff950ca6706610 : bac481e0 Command = 502 XferState = 830ad Wait Time = 120Sec
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Host adapter abort request (0,0,1,0)
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:Timed out Command: 2a 00 c0 98 69 a2 00 02 00 00 00 00 00 00 00 00
2018-09-24 03:00:29 aacraid 0000:03:00.0: AAC0:aac_eh_abort:FIB = ffff950ca6706558 : bac479c0 Command = 502 XferState = 830ad Wait Time = 120Sec

All series 6 controllers are the ones that show problems under high load.

Greets,
Stefan

>
>> $ git log 0662cc968ace..master drivers/scsi/aacraid | grep -c ^commit
>> 114
>>
>> Bart.
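
For anyone reading the "Timed out Command" lines above: opcode 0x2a is WRITE(10) in the SCSI block command set. The following is only a rough sketch for decoding those bytes. It assumes the standard WRITE(10) layout (bytes 2-5 hold the big-endian LBA, bytes 7-8 the big-endian transfer length in blocks) and that the driver prints the 10-byte CDB padded with zeros to 16 bytes. The helper name decode_write10 is made up for this example and is not part of the driver or any library.

```python
# Hypothetical helper: decode the WRITE(10) CDBs printed by aac_eh_abort.
# Assumes the standard WRITE(10) layout; trailing padding bytes are ignored.

WRITE_10 = 0x2A


def decode_write10(cdb_hex: str) -> dict:
    """Return the LBA and block count of a WRITE(10) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) < 10 or cdb[0] != WRITE_10:
        raise ValueError("not a WRITE(10) CDB")
    return {
        "lba": int.from_bytes(cdb[2:6], "big"),     # logical block address
        "blocks": int.from_bytes(cdb[7:9], "big"),  # transfer length in blocks
    }


# The four timed-out commands, copied from the log in this mail.
timed_out = [
    "2a 00 c0 98 6f a2 00 00 28 00 00 00 00 00 00 00",
    "2a 00 c0 98 6d a2 00 02 00 00 00 00 00 00 00 00",
    "2a 00 c0 98 6b a2 00 02 00 00 00 00 00 00 00 00",
    "2a 00 c0 98 69 a2 00 02 00 00 00 00 00 00 00 00",
]

for line in timed_out:
    info = decode_write10(line)
    print(f"lba=0x{info['lba']:08x} blocks={info['blocks']}")
```

Decoded this way, the four commands are writes of 40, 512, 512 and 512 blocks whose start addresses are 512 blocks apart, so under the layout assumed above they look like one contiguous sequential write rather than scattered I/O.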