On 21/12/2020 11:00, Dmitry Kadashev wrote:
[snip]
>> We do not share rings between processes. Our rings are accessible from different
>> threads (under locks), but nothing fancy.
>>
>>> In other words, if you kill all your io_uring applications, does it
>>> go back to normal?
>>
>> I'm pretty sure it does not, the only fix is to reboot the box. But I'll find an
>> affected box and double check just in case.

I can't spot any misaccounting, but I wonder if your memory is getting
fragmented enough to be unable to make an allocation of 16 __contiguous__
pages, i.e. sizeof(sqe) * 1024. That's how it's allocated internally:

static void *io_mem_alloc(size_t size)
{
	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN |
				__GFP_COMP | __GFP_NORETRY;

	return (void *) __get_free_pages(gfp_flags, get_order(size));
}

What about smaller rings? Can you check what SQ sizes io_uring can still
allocate? That can be a different program, e.g. modify liburing/test/nop a
bit. Also, can you allocate a ring if you switch to a different user
(preferably non-root) after it happens?

> So, I've just tried stopping everything that uses io-uring. No io_wq* processes
> remained:
>
> $ ps ax | grep wq
>     9 ?        I<     0:00 [mm_percpu_wq]
>   243 ?        I<     0:00 [tpm_dev_wq]
>   246 ?        I<     0:00 [devfreq_wq]
> 27922 pts/4    S+     0:00 grep --colour=auto wq
> $
>
> But not a single ring (with size 1024) can be created afterwards anyway.
>
> Apparently the problem netty hit and this one are different?

--
Pavel Begunkov