On Fri, 26 Feb 2021, JeffleXu wrote:

> On 2/20/21 3:38 AM, Mikulas Patocka wrote:
> >
> > On Mon, 8 Feb 2021, Jeffle Xu wrote:
> >
> >> Offer one fastpath of bio-based polling when bio submitted to dm device
> >> is not split.
> >>
> >> In this case, there will be only one bio submitted to only one polling
> >> hw queue of one underlying mq device, and thus we don't need to track
> >> all split bios or iterate through all polling hw queues. The pointer to
> >> the polling hw queue the bio submitted to is returned here as the
> >> returned cookie.
> >
> > This doesn't seem safe - note that between submit_bio() and blk_poll(), no
> > locks are held - so the device mapper device may be reconfigured
> > arbitrarily. When you call blk_poll() with a pointer returned by
> > submit_bio(), the pointer may point to a stale address.
> >
>
> Thanks for the feedback. Indeed maybe it's not a good idea to directly
> return a 'struct blk_mq_hw_ctx *' pointer as the returned cookie.
>
> Currently I have no idea to fix it, orz... The
> blk_get_queue()/blk_put_queue() tricks may not work in this case.
> Because the returned cookie may not be used at all. Before calling
> blk_poll(), the polling routine may find that the corresponding IO has
> already completed, and thus won't call blk_poll(), in which case we have
> no place to put the refcount.
>
> But I really don't want to drop this optimization, since this
> optimization is quite intuitive when dm device maps to a lot of
> underlying devices. Though this optimization doesn't actually achieve
> reasonable performance gain in my test, maybe because there are at most
> seven nvme devices in my test machine.
>
> Any thoughts?
>
> Thanks,
> Jeffle

Hi

I reworked device mapper polling, so that we poll in the function
__split_and_process_bio. The pointer to a queue and the polling cookie is
passed only inside device mapper code, it never leaves it.
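To illustrate the idea, here is a toy userspace model. It is not the actual
patches and not kernel code: struct toy_hw_queue, toy_submit(), toy_poll() and
toy_split_and_process() are all hypothetical names made up for this sketch. It
only shows the shape of the approach, assuming the submitting function can
poll until its own I/O completes: the queue pointer is a local variable and is
never handed back to the caller, so there is no cookie that could go stale.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a polling hardware queue.
 * This is NOT the kernel's struct blk_mq_hw_ctx. */
struct toy_hw_queue {
	int pending;	/* completions still outstanding */
};

/* Hypothetical: reap at most one completion, return how many were reaped. */
static int toy_poll(struct toy_hw_queue *q)
{
	if (q->pending == 0)
		return 0;
	q->pending--;
	return 1;
}

/* Hypothetical: queue nr requests and report which queue took them. */
static struct toy_hw_queue *toy_submit(struct toy_hw_queue *q, int nr)
{
	q->pending += nr;
	return q;
}

/* Submit and poll in one place. The queue pointer stays in a local
 * variable and is never returned, so the caller cannot keep it across
 * a reconfiguration. Returns the number of completions reaped. */
static int toy_split_and_process(struct toy_hw_queue *q, int nr)
{
	struct toy_hw_queue *hctx = toy_submit(q, nr);
	int reaped = 0;

	assert(hctx != NULL);
	while (hctx->pending > 0)
		reaped += toy_poll(hctx);
	return reaped;
}
```

The caller only sees a count of completed requests. Whatever protects the
mapping during submission (not modelled here) also covers the polling loop.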

I'll send you my patches - try them and tell me how they perform compared
to your patchset.

Mikulas