> -----Original Message-----
> From: Linux-nvme [mailto:linux-nvme-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf
> Of Hannes Reinecke
> Sent: Monday, January 29, 2018 3:09 AM
> To: lsf-pc@xxxxxxxxxxxxxxxxxxxxxxxxxx
> Cc: linux-nvme@xxxxxxxxxxxxxxxxxxx; linux-scsi@xxxxxxxxxxxxxxx; Kashyap
> Desai <kashyap.desai@xxxxxxxxxxxx>
> Subject: [LSF/MM TOPIC] irq affinity handling for high CPU count machines
>
> Hi all,
>
> here's a topic which came up on the SCSI ML (cf thread '[RFC 0/2]
> mpt3sas/megaraid_sas: irq poll and load balancing of reply queue').
>
> When doing I/O tests on a machine with more CPUs than MSIx vectors
> provided by the HBA we can easily set up a scenario where one CPU is
> submitting I/O and the other one is completing I/O. Which will result in
> the latter CPU being stuck in the interrupt completion routine for
> basically ever, resulting in the lockup detector kicking in.
>
> How should these situations be handled?
> Should it be made the responsibility of the drivers, ensuring that the
> interrupt completion routine is terminated after a certain time?
> Should it be made the responsibility of the upper layers?
> Should it be the responsibility of the interrupt mapping code?
> Can/should interrupt polling be used in these situations?

Back when we introduced scsi-mq with hpsa, the best approach was to
route interrupts and completion handling so that each CPU core handles
the completions for its own submissions; this way, the cores are
self-throttling. Every other arrangement was subject to soft lockups and
other problems when the completion CPUs became overwhelmed with work.

See https://lkml.org/lkml/2014/9/9/931.

---
Robert Elliott, HPE Persistent Memory
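
As a rough illustration of the self-throttling argument, here is a toy
model in Python. It is not kernel code and not what hpsa does; the
function names, the per-tick CPU "budget", the unit cost of one
submission or completion, and the queue depth are all assumptions made
up for the sketch. It counts the longest run of consecutive ticks in
which a CPU spends its entire budget in completion handling, which is
the condition the lockup detector would eventually flag.

```python
def run_local(ticks, budget=4):
    """Each CPU completes its own submissions (hypothetical toy model).

    Returns the longest run of consecutive ticks in which the CPU spent
    its whole budget on completions.
    """
    pending = 0
    streak = longest = 0
    for _ in range(ticks):
        done = min(pending, budget)   # completions run first (interrupt context)
        pending -= done
        pending += budget - done      # only leftover CPU time submits new I/O
        streak = streak + 1 if done == budget else 0
        longest = max(longest, streak)
    return longest


def run_remote(ticks, submitters, budget=4, queue_depth=64):
    """`submitters` CPUs submit; one other CPU handles every completion.

    Returns the longest run of consecutive ticks in which the completion
    CPU spent its whole budget on completions.
    """
    pending = 0
    streak = longest = 0
    for _ in range(ticks):
        done = min(pending, budget)   # completion CPU drains what it can
        pending -= done
        # Submitters never pay for completions, so only the (assumed)
        # queue depth limits how much they can keep in flight.
        pending += min(submitters * budget, queue_depth - pending)
        streak = streak + 1 if done == budget else 0
        longest = max(longest, streak)
    return longest


if __name__ == "__main__":
    print(run_local(100))                 # prints 1
    print(run_remote(100, submitters=1))  # prints 99
    print(run_remote(100, submitters=4))  # prints 99
```

In this model the local arrangement never keeps a CPU in completion
handling for more than one tick in a row, because time spent completing
is time not spent submitting. With remote completion, even a single
submitter keeps the completion CPU saturated for the rest of the run.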