On Tue, Dec 22, 2020 at 11:11 AM Pavel Begunkov <asml.silence@xxxxxxxxx> wrote:
>
> On 22/12/2020 03:35, Pavel Begunkov wrote:
> > On 21/12/2020 11:00, Dmitry Kadashev wrote:
> > [snip]
> >>> We do not share rings between processes. Our rings are accessible from different
> >>> threads (under locks), but nothing fancy.
> >>>
> >>>> In other words, if you kill all your io_uring applications, does it
> >>>> go back to normal?
> >>>
> >>> I'm pretty sure it does not, the only fix is to reboot the box. But I'll find an
> >>> affected box and double check just in case.
> >
> > I can't spot any misaccounting, but I wonder if it can be that your memory is
> > getting fragmented enough to be unable make an allocation of 16 __contiguous__
> > pages, i.e. sizeof(sqe) * 1024
> >
> > That's how it's allocated internally:
> >
> > static void *io_mem_alloc(size_t size)
> > {
> > 	gfp_t gfp_flags = GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP |
> > 				__GFP_NORETRY;
> >
> > 	return (void *) __get_free_pages(gfp_flags, get_order(size));
> > }
> >
> > What about smaller rings? Can you check io_uring of what SQ size it can allocate?
> > That can be a different program, e.g. modify a bit liburing/test/nop.
>
> Even better to allocate N smaller rings, where N = 1024 / SQ_size
>
> static int try_size(int sq_size)
> {
> 	int ret = 0, i, n = 1024 / sq_size;
> 	static struct io_uring rings[128];
>
> 	for (i = 0; i < n; ++i) {
> 		if (io_uring_queue_init(sq_size, &rings[i], 0) < 0) {
> 			ret = -1;
> 			break;
> 		}
> 	}
> 	for (i -= 1; i >= 0; i--)
> 		io_uring_queue_exit(&rings[i]);
> 	return ret;
> }
>
> int main()
> {
> 	int size;
>
> 	for (size = 1024; size >= 2; size /= 2) {
> 		if (!try_size(size)) {
> 			printf("max size %i\n", size);
> 			return 0;
> 		}
> 	}
>
> 	printf("can't allocate %i\n", size);
> 	return 0;
> }

Unfortunately I've rebooted the box I've used for tests yesterday, so I can't try
this there. Also I was not able to come up with an isolated reproducer for this
yet.

The good news is I've found a relatively easy way to provoke this on a test VM
using our software. Our app runs with "admin" user perms (plus some
capabilities), and it bumps RLIMIT_MEMLOCK to infinity on start. I've also created
a user called 'ioutest' so that the ring size check can be run as a different
user.

I've modified the test program slightly to show the number of rings
successfully created on each iteration and the actual error message (this was to
debug a problem I was having with it, but I've kept it afterwards). Here is the
output:

# sudo -u admin bash -c 'ulimit -a' | grep locked
max locked memory       (kbytes, -l) 1024
# sudo -u ioutest bash -c 'ulimit -a' | grep locked
max locked memory       (kbytes, -l) 1024
# sudo -u admin ./iou-test1
Failed after 0 rings with 1024 size: Cannot allocate memory
Failed after 0 rings with 512 size: Cannot allocate memory
Failed after 0 rings with 256 size: Cannot allocate memory
Failed after 0 rings with 128 size: Cannot allocate memory
Failed after 0 rings with 64 size: Cannot allocate memory
Failed after 0 rings with 32 size: Cannot allocate memory
Failed after 0 rings with 16 size: Cannot allocate memory
Failed after 0 rings with 8 size: Cannot allocate memory
Failed after 0 rings with 4 size: Cannot allocate memory
Failed after 0 rings with 2 size: Cannot allocate memory
can't allocate 1
# sudo -u ioutest ./iou-test1
max size 1024
# ps ax | grep wq
    8 ?        I<     0:00 [mm_percpu_wq]
  121 ?        I<     0:00 [tpm_dev_wq]
  124 ?        I<     0:00 [devfreq_wq]
20593 pts/1    S+     0:00 grep --color=auto wq

-- 
Dmitry Kadashev