On Tue, Jun 15, 2021 at 03:10:31PM +0200, Christoph Hellwig wrote:
> Replace the blk_poll interface that requires the caller to keep a queue
> and cookie from the submissions with polling based on the bio.
>
> Polling for the bio itself leads to a few advantages:
>
>  - the cookie construction can be made entirely private in blk-mq.c
>  - the caller does not need to remember the request_queue and cookie
>    separately and thus sidesteps their lifetime issues
>  - keeping the device and the cookie inside the bio makes it trivial to
>    support polling of bios remapped by stacking drivers
>  - a lot of code to propagate the cookie back up the submission path can
>    be removed entirely.
>
> ...
>
> +/**
> + * bio_poll - poll for BIO completions
> + * @bio: bio to poll for
> + * @flags: BLK_POLL_* flags that control the behavior
> + *
> + * Poll for completions on queue associated with the bio. Returns number of
> + * completed entries found.
> + *
> + * Note: the caller must either be the context that submitted @bio, or
> + * be in a RCU critical section to prevent freeing of @bio.
> + */
> +int bio_poll(struct bio *bio, unsigned int flags)
> +{
> +	struct request_queue *q = bio->bi_bdev->bd_disk->queue;
> +	blk_qc_t cookie = READ_ONCE(bio->bi_cookie);
> +	int ret;
> +
> +	if (cookie == BLK_QC_T_NONE ||
> +	    !test_bit(QUEUE_FLAG_POLL, &q->queue_flags))
> +		return 0;
> +
> +	if (current->plug)
> +		blk_flush_plug_list(current->plug, false);
> +
> +	if (blk_queue_enter(q, BLK_MQ_REQ_NOWAIT))
> +		return 0;
> +	if (WARN_ON_ONCE(!queue_is_mq(q)))
> +		ret = 0; /* not yet implemented, should not happen */
> +	else
> +		ret = blk_mq_poll(q, cookie, flags);
> +	blk_queue_exit(q);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(bio_poll);
> +
> +/*
> + * Helper to implement file_operations.iopoll. Requires the bio to be stored
> + * in iocb->private, and cleared before freeing the bio.
> + */
> +int iocb_bio_iopoll(struct kiocb *kiocb, unsigned int flags)
> +{
> +	struct bio *bio;
> +	int ret = 0;
> +
> +	/*
> +	 * Note: the bio cache only uses SLAB_TYPESAFE_BY_RCU, so bio can
> +	 * point to a freshly allocated bio at this point. If that happens
> +	 * we have a few cases to consider:
> +	 *
> +	 * 1) the bio is being initialized and bi_bdev is NULL. We can
> +	 *    simply do nothing in this case
> +	 * 2) the bio points to a device that is not poll enabled. bio_poll
> +	 *    will catch this and return 0
> +	 * 3) the bio points to a poll capable device, including but not
> +	 *    limited to the one that the original bio pointed to. In this
> +	 *    case we will call into the actual poll method and poll for I/O,
> +	 *    even if we don't need to, but it won't cause harm either.
> +	 *
> +	 * For cases 2) and 3) above the RCU grace period ensures that bi_bdev
> +	 * is still allocated. Because partitions hold a reference to the whole
> +	 * device bdev and thus disk, the disk is also still valid. Grabbing
> +	 * a reference to the queue in bio_poll() ensures the hctxs and requests
> +	 * are still valid as well.
> +	 */

I am not sure the disk is valid here: we only hold the disk while the bdev
is open, but the bdev can be closed while polling is still in progress.

Also, the disk always holds a reference on the request queue, so if the
disk is valid there is no need to grab the queue's refcount in bio_poll().

Thanks,
Ming
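
As a rough illustration of the contract iocb_bio_iopoll() expects (this is
not code from the patch; the mydrv_* names are hypothetical), a driver
would store the bio in iocb->private before submission, clear it in its
end_io handler before the bio can be freed, and point its file_operations
->iopoll at the helper:

#include <linux/bio.h>
#include <linux/blkdev.h>
#include <linux/fs.h>

/* Hypothetical end_io handler: runs when the polled bio completes. */
static void mydrv_bio_end_io(struct bio *bio)
{
	struct kiocb *iocb = bio->bi_private;

	/* Clear the pointer before the bio can be recycled. */
	WRITE_ONCE(iocb->private, NULL);
	/* ... complete the iocb (call ->ki_complete) ... */
	bio_put(bio);
}

/* Hypothetical submission path: publish the in-flight bio for ->iopoll. */
static ssize_t mydrv_submit(struct kiocb *iocb, struct bio *bio)
{
	bio->bi_private = iocb;
	bio->bi_end_io = mydrv_bio_end_io;
	WRITE_ONCE(iocb->private, bio);
	submit_bio(bio);
	return -EIOCBQUEUED;
}

static const struct file_operations mydrv_fops = {
	/* ... */
	.iopoll	= iocb_bio_iopoll,
};

With that wiring, the caller's io_uring/aio poll loop ends up in
iocb_bio_iopoll(), which loads iocb->private and hands the bio to
bio_poll() as in the hunk above.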