On 8/9/21 6:04 AM, Pavel Begunkov wrote:
> Checking flags is a bit faster and can be batched, but the main reason
> of controlling ->async_data with req->flags but not relying on NULL is
> that we safely move it now to the end of io_kiocb, where cachelines are
> rarely loaded, and use that freed space for something more hot like
> io_mapped_ubuf.

As far as I can tell, this will run into an issue with double poll:

static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
			       struct poll_table_struct *p)
{
	struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);

	__io_queue_proc(&pt->req->poll, pt, head,
			(struct io_poll_iocb **) &pt->req->async_data);
}

where we store the potential extra allocation, if any, in the async_data
field. That also needs to get freed when we release this request. One
solution would be to just set REQ_F_ASYNC_DATA before calling
__io_queue_proc() (sketched below).

> @@ -3141,6 +3150,8 @@ static inline int io_alloc_async_data(struct io_kiocb *req)
>  {
>  	WARN_ON_ONCE(!io_op_defs[req->opcode].async_size);
>  	req->async_data = kmalloc(io_op_defs[req->opcode].async_size, GFP_KERNEL);
> +	if (req->async_data)
> +		req->flags |= REQ_F_ASYNC_DATA;
>  	return req->async_data == NULL;
>  }

With this change, it would be better to simply do:

	if (req->async_data) {
		req->flags |= REQ_F_ASYNC_DATA;
		return false;
	}
	return true;

-- 
Jens Axboe
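
A minimal sketch of the double-poll suggestion above, assuming the request
free path keys off REQ_F_ASYNC_DATA and tolerates a NULL ->async_data behind
the flag (kfree(NULL) is a no-op). This is an untested illustration of the
idea, not the actual patch:

static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
			       struct poll_table_struct *p)
{
	struct io_poll_table *pt = container_of(p, struct io_poll_table, pt);

	/* Mark ->async_data as owned up front, so that if __io_queue_proc()
	 * stores a second poll entry there, the request free path will see
	 * the flag and kfree() it. */
	pt->req->flags |= REQ_F_ASYNC_DATA;
	__io_queue_proc(&pt->req->poll, pt, head,
			(struct io_poll_iocb **) &pt->req->async_data);
}

An alternative with the same effect would be to check ->async_data for
non-NULL after the call returns and only then set the flag, which keeps the
flag and the pointer strictly in sync.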