Hi,

One of the more expensive parts of io_req_local_work_add() is that it has to pull in the remote task tctx to check for the very unlikely event that we are in a cancelation state.

Cache the cancelation state in each ctx instead. This makes the marking of cancelation (and clearing) a bit more expensive, but those are very slow path operations. The upside is that io_req_local_work_add() becomes a lot cheaper, which is what we care about.

-- 
Jens Axboe
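
For readers who want the shape of the trade-off, here is a minimal userspace sketch of caching a per-task flag in each ring context. It is not the kernel code: the struct names, fields, and helpers (ring_ctx, task_ctx, task_set_cancel, local_work_add_sees_cancel) are hypothetical stand-ins, and it ignores the locking and memory-ordering concerns the real patches have to handle.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a ring context ("ctx"). */
struct ring_ctx {
    bool in_cancel;          /* cached copy of the owning task's cancel state */
    struct ring_ctx *next;   /* next ring owned by the same task */
};

/* Hypothetical, simplified stand-in for the per-task context ("tctx"). */
struct task_ctx {
    bool in_cancel;          /* authoritative per-task cancel state */
    struct ring_ctx *rings;  /* singly linked list of this task's rings */
};

/*
 * Slow path: marking or clearing cancelation now walks every ring the
 * task owns and updates the cached copy, so it costs more than before.
 */
static void task_set_cancel(struct task_ctx *tctx, bool cancel)
{
    tctx->in_cancel = cancel;
    for (struct ring_ctx *ctx = tctx->rings; ctx; ctx = ctx->next)
        ctx->in_cancel = cancel;
}

/*
 * Hot path: the check reads only the ctx that is already in hand and
 * never dereferences the (possibly remote) task context.
 */
static bool local_work_add_sees_cancel(const struct ring_ctx *ctx)
{
    return ctx->in_cancel;
}
```

The point of the sketch is only where the cost lands: the setter does work proportional to the number of rings, while the hot-path check is a single load from memory the caller already touches.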