> -----Original Message-----
> From: linux-kernel-owner at vger.kernel.org
> [mailto:linux-kernel-owner at vger.kernel.org] On Behalf Of Sreekanth Reddy
> Sent: Thursday, August 18, 2016 12:56 AM
> Subject: Observing Softlockup's while running heavy IOs
>
> Problem statement:
> Observing softlockups while running heavy IOs on 8 SSD drives
> connected behind our LSI SAS 3004 HBA.
>
...
>
> Observing a loop in the IO path, i.e. only one CPU is busy with
> processing the interrupts and the other CPUs (in the affinity_hint
> mask) are busy with sending the IOs (these CPUs are not receiving
> any interrupts at all). For example, only CPU6 is busy with
> processing the interrupts from IRQ 219, and the remaining CPUs, i.e.
> CPU 7, 8, 9, 10 & 11, are just busy with pumping the IOs and they
> never processed any IO interrupts from IRQ 219. So we are observing
> softlockups due to the existence of this loop in the IO path.
>
> We might not observe these softlockups if irqbalancer had balanced
> the interrupts among the CPUs enabled in the particular irq's
> affinity_hint mask, so that all the CPUs are equally busy with
> sending IOs and processing the interrupts. I am not sure how
> irqbalancer balances the load among the CPUs, but here I see that
> only one CPU from the irq's affinity_hint mask is busy with
> interrupts and the remaining CPUs don't receive any interrupts from
> this IRQ.
>
> Please help me with any suggestions/recommendations to solve/limit
> these kinds of softlockups. Also please let me know if I have missed
> any setting in irqbalance.
>

The CPUs need to be forced to self-throttle by processing interrupts
for their own submissions, which reduces the time they have available
to submit more IOs.

See https://lkml.org/lkml/2014/9/9/931 for discussion of this problem
when blk-mq was added.

---
Robert Elliott, HPE Persistent Memory
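
For anyone wanting to confirm the imbalance described in the quoted
report, the per-CPU counters in /proc/interrupts show how completions
for one IRQ are distributed. The following is only a minimal sketch
and not something taken from this thread: the function names are
hypothetical, and the sample text uses made-up counts on a 4-CPU
layout purely for illustration. On a real system the same parser
could be fed the contents of /proc/interrupts.

```python
# Sketch: summarize how one IRQ's interrupts are spread across CPUs.
# SAMPLE is invented data in /proc/interrupts layout, not real output.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 218:         5          0          0          0   PCI-MSI-edge  example-msix0
 219:         0          0    1500000         12   PCI-MSI-edge  example-msix1
"""


def irq_counts_per_cpu(proc_interrupts_text, irq):
    """Return {cpu_index: count} for one IRQ from /proc/interrupts-style text."""
    lines = proc_interrupts_text.splitlines()
    # The header row names the online CPUs: "CPU0 CPU1 ..."
    cpus = [int(name[3:]) for name in lines[0].split()]
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        if label.strip() == str(irq):
            # The first len(cpus) fields are the per-CPU counters;
            # the remaining fields describe the controller and device.
            fields = rest.split()
            counts = [int(f) for f in fields[:len(cpus)]]
            return dict(zip(cpus, counts))
    raise KeyError(irq)


def busiest_share(counts):
    """Return (cpu, fraction of all interrupts handled by that CPU)."""
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    cpu = max(counts, key=counts.get)
    return cpu, counts[cpu] / total


if __name__ == "__main__":
    counts = irq_counts_per_cpu(SAMPLE, 219)
    cpu, share = busiest_share(counts)
    print(f"IRQ 219: CPU{cpu} handled {share:.1%} of interrupts")
```

A share close to 100% on a single CPU, while the other CPUs in the
mask keep submitting IO, would match the pattern reported above.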