On Thu, Oct 22 2020 at 09:28, Marcelo Tosatti wrote:
> On Wed, Oct 21, 2020 at 10:25:48PM +0200, Thomas Gleixner wrote:
>> The right answer to this is to utilize managed interrupts and have
>> according logic in your network driver to handle CPU hotplug. When a CPU
>> goes down, then the queue which is associated to that CPU is quiesced
>> and the interrupt core shuts down the relevant interrupt instead of
>> moving it to an online CPU (which causes the whole vector exhaustion
>> problem on x86). When the CPU comes online again, then the interrupt is
>> reenabled in the core and the driver reactivates the queue.
>
> Aha... But it would be necessary to do that from userspace (for runtime
> isolate/unisolate).

For anything which uses managed interrupts this is a non-problem and
userspace has absolutely no business with it.

Isolation does not shut down queues, at least not the block multi-queue
ones which are only active when I/O is issued from that isolated CPU.

So transitioning out of isolation requires no action at all.

Transitioning in or changing the housekeeping mask needs some trivial
tweak to handle the case where there is an overlap in the cpuset of a
queue (housekeeping and isolated). This is handled already for setup and
affinity changes, but of course not for runtime isolation mask changes,
but that's a trivial thing to do.

What's more interesting is how to deal with the network problem where
there is no guarantee that the "response" ends up on the same queue as
the "request" which is what the block people rely on. And that problem
is not really an interrupt affinity problem in the first place.

Thanks,

        tglx
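
For illustration only, the overlap case described above (a queue whose
cpuset spans both housekeeping and isolated CPUs) could be sketched as
below. This is a userspace model and not kernel code. The type cpu_bits
and the helper effective_queue_mask() are hypothetical names, and the
one-bit-per-CPU 64-bit mask is an assumption made to keep the example
self-contained.

```c
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_bits;

/*
 * Pick the CPUs a queue interrupt should be steered to.
 *
 * If the queue's cpuset overlaps the housekeeping set and at least one
 * CPU in that overlap is online, restrict the interrupt to the overlap
 * so that isolated CPUs are left undisturbed. Otherwise (the queue is
 * purely isolated, or its housekeeping CPUs are all offline) fall back
 * to the queue's full cpuset.
 */
static cpu_bits effective_queue_mask(cpu_bits queue_mask,
                                     cpu_bits housekeeping,
                                     cpu_bits online)
{
    cpu_bits preferred = queue_mask & housekeeping;

    if (preferred & online)
        return preferred;

    return queue_mask;
}
```

With a queue covering CPUs 0-3 and CPUs 0-1 as housekeeping, the sketch
returns CPUs 0-1 as long as one of them is online. A runtime change of
the isolation mask would amount to re-running the same computation with
the new housekeeping set.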