Really nice work, I have a question though. Is it possible to efficiently wait for request completions from multiple threads? For example, if two threads both call "io_uring_enter" with min_complete=1 while the completion ring holds 2 events, will the first event go to thread 1 and the second to thread 2? I just do not understand the best way to scale this API across multiple threads. With IOCP, for example, it is perfectly clear.