On 5/20/20 8:45 AM, Jens Axboe wrote:
> On 5/20/20 2:03 AM, Christoph Hellwig wrote:
>> On Wed, May 20, 2020 at 11:04:24AM +0800, Ming Lei wrote:
>>> On Wed, May 20, 2020 at 09:18:23AM +0800, Ming Lei wrote:
>>>> On Tue, May 19, 2020 at 05:30:00PM +0200, Christoph Hellwig wrote:
>>>>> On Tue, May 19, 2020 at 09:54:20AM +0800, Ming Lei wrote:
>>>>>> As Thomas clarified, workqueue hasn't such issue any more, and only other
>>>>>> per CPU kthreads can run until the CPU clears the online bit.
>>>>>>
>>>>>> So the question is if IO can be submitted from such kernel context?
>>>>>
>>>>> What other per-CPU kthreads even exist?
>>>>
>>>> I don't know, so expose to wider audiences.
>>>
>>> One user is io_uring with IORING_SETUP_SQPOLL & IORING_SETUP_SQ_AFF, see
>>> io_sq_offload_start(), and it is an IO submission kthread.
>>
>> As far as I can tell that code is buggy, as it still needs to migrate
>> the thread away when the cpu is offlined.  This isn't a per-cpu kthread
>> in the sense of having one for each CPU.
>>
>> Jens?
>
> It just uses kthread_create_on_cpu(), nothing home grown. Pretty sure
> they just break affinity if that CPU goes offline.

Just checked, and it works fine for me. If I create an SQPOLL ring with
SQ_AFF set and bound to CPU 3, and CPU 3 then goes offline, the kthread
just appears unbound but runs just fine. When CPU 3 comes online again,
the mask appears correct.

So I don't think there's anything wrong on that side. The affinity is a
performance optimization, not a correctness issue. There's really not
much we can do if the chosen CPU is offlined, apart from continuing to
chug along.

-- 
Jens Axboe
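
For anyone wanting to reproduce the test described above, here is a
minimal sketch of setting up such a ring. It is not taken from the
thread: the helper names sqpoll_affine_params() and setup_sqpoll_ring()
are made up for illustration, the raw io_uring_setup(2) syscall is
assumed rather than any particular library, and the sq_thread_idle
value is arbitrary. Depending on the kernel version, SQPOLL may need
elevated privileges, so the setup call can fail with an errno.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/io_uring.h>

/*
 * Hypothetical helper: build io_uring_params for a ring whose submission
 * queue is polled by a kernel thread (SQPOLL) pinned to `cpu` (SQ_AFF).
 */
static struct io_uring_params sqpoll_affine_params(unsigned int cpu)
{
	struct io_uring_params p;

	memset(&p, 0, sizeof(p));
	p.flags = IORING_SETUP_SQPOLL | IORING_SETUP_SQ_AFF;
	p.sq_thread_cpu = cpu;
	/* Milliseconds the poll thread stays busy before sleeping; arbitrary. */
	p.sq_thread_idle = 2000;
	return p;
}

/*
 * Hypothetical helper: create the ring via the raw syscall.
 * Returns the ring fd on success, or -errno on failure
 * (e.g. insufficient privileges, or `cpu` not currently online).
 */
static int setup_sqpoll_ring(unsigned int entries, unsigned int cpu)
{
	struct io_uring_params p = sqpoll_affine_params(cpu);
	long fd = syscall(__NR_io_uring_setup, entries, &p);

	return fd < 0 ? -errno : (int)fd;
}
```

Calling setup_sqpoll_ring(8, 3) and then offlining CPU 3 through
/sys/devices/system/cpu/cpu3/online should allow the poll thread's
affinity to be inspected, for example with taskset -p on its PID.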