Hi,

For v1, the replies to it, and tons of perf measurements, go here:

https://lore.kernel.org/io-uring/3d553205-0fe2-482e-8d4c-a4a1ad278893@xxxxxxxxx/T/#m12f44c0a9ee40a59b0dcc226e22a0d031903aa73

as I won't duplicate them in here. Performance has been improved since
v1 as well, as the slab accounting is gone and we now rely solely on
the completion_lock on the issuer side.

Changes since v1:
- Change commit messages to reflect it's DEFER_TASKRUN, not
  SINGLE_ISSUER
- Get rid of the need to double lock on the target uring_lock
- Relax the check for needing remote posting, and then finally kill it
- Unify it across ring types
- Kill (now) unused callback_head in io_msg
- Add overflow caching to avoid __GFP_ACCOUNT overhead
- Rebase on current git master with 6.9 and 6.10 fixes pulled in

-- 
Jens Axboe