Re: [PATCH v3 09/11] dm: support IO polling for bio-based dm device

Ming Lei <ming.lei@xxxxxxxxxx> · Tue, 9 Feb 2021 16:07:39 +0800

On Tue, Feb 09, 2021 at 02:13:38PM +0800, JeffleXu wrote:
> 
> 
> On 2/9/21 11:11 AM, Ming Lei wrote:
> > On Mon, Feb 08, 2021 at 04:52:41PM +0800, Jeffle Xu wrote:
> >> DM will iterate and poll all polling hardware queues of all target mq
> >> devices when polling IO for dm device. To mitigate the race introduced
> >> by iterating all target hw queues, a per-hw-queue flag is maintained
> > 
> > What is the per-hw-queue flag?
> 
> Sorry I forgot to update the commit message as the implementation
> changed. Actually this mechanism is implemented by patch 10 of this
> patch set.

It is hard to associate patch 10's spin_trylock() with per-hw-queue
flag. Also scsi's poll implementation is in-progress, and scsi's poll may
not be implemented in this way.

> 
> > 
> >> to indicate whether this polling hw queue currently being polled on or
> >> not. Every polling hw queue is exclusive to one polling instance, i.e.,
> >> the polling instance will skip this polling hw queue if this hw queue
> >> currently is being polled by another polling instance, and start
> >> polling on the next hw queue.
> > 
> > Not see such skip in dm_poll_one_dev() in which
> > queue_for_each_poll_hw_ctx() is called directly for polling all POLL
> > hctxs of the request queue, so can you explain it a bit more about this
> > skip mechanism?
> > 
> 
> It is implemented as patch 10 of this patch set. When spin_trylock()
> fails, the polling instance will return immediately, instead of busy
> waiting.
> 
> 
> > Even though such skipping is implemented, not sure if good performance
> > can be reached because hctx poll may be done in ping-pong style
> > among several CPUs. But blk-mq hctx is supposed to have its cpu affinities.
> > 
> 
> Yes, the mechanism of iterating all hw queues can make the competition
> worse.
> 
> If every underlying data device has **only** one polling hw queue, then
> this ping-pong style polling still exist, even when we implement split
> bio tracking mechanism, i.e., acquiring the specific hw queue the bio
> enqueued into. Because multiple polling instance has to compete for the
> only polling hw queue.
> 
> But if multiple polling hw queues per device are reserved for multiple
> polling instances, (e.g., every underlying data device has 3 polling hw
> queues when there are 3 polling instances), just as what we practice on
> mq polling, then the current implementation of iterating all hw queues
> will indeed works in a ping-pong style, while this issue shall not exist
> when accurate split bio tracking mechanism could be implemented.

In reality it could be possible to have one hw queue for each numa node.

And you may re-use blk_mq_map_queue() for getting the proper hw queue for poll.

-- 
Ming