Re: [PATCH 3/3] blk-mq: make the polling code adaptive

On Fri, Nov 11, 2016 at 10:11:27PM -0700, Jens Axboe wrote:
> The previous commit introduced the hybrid sleep/poll mode. Take
> that one step further, and use the completion latencies to
> automatically sleep for half the mean completion time. This is
> a good approximation.
> 
> This changes the 'io_poll_delay' sysfs file a bit to expose the
> various options. Depending on the value, the polling code will
> behave differently:
> 
> -1	Never enter hybrid sleep mode
>  0	Use half of the completion mean for the sleep delay
> >0	Use this specific value as the sleep delay
> 
> Signed-off-by: Jens Axboe <axboe@xxxxxx>
> ---
>  block/blk-mq.c         | 74 ++++++++++++++++++++++++++++++++++++++++++++++----
>  block/blk-sysfs.c      | 26 ++++++++++++------
>  include/linux/blkdev.h |  2 +-
>  3 files changed, 88 insertions(+), 14 deletions(-)
> 
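
For anyone skimming the thread, the three io_poll_delay cases in the commit
message boil down to the selection below. This is only a userspace sketch:
pick_sleep_ns() and mean_completion_ns are made-up names, and the real
blk_mq_poll_nsecs() is in the snipped part of the patch, so how the mean is
actually derived is not shown here.

```c
#include <assert.h>

/*
 * Userspace sketch, not kernel code. poll_delay_ns mirrors the three
 * io_poll_delay cases; mean_completion_ns is a hypothetical stand-in for
 * whatever the stats code reports. A return value of 0 means "skip the
 * hybrid sleep and go straight to busy polling".
 */
static unsigned long long pick_sleep_ns(long long poll_delay_ns,
					unsigned long long mean_completion_ns)
{
	/* Values below -1 are not described in the patch; reject them here. */
	assert(poll_delay_ns >= -1);

	if (poll_delay_ns == -1)	/* never enter hybrid sleep mode */
		return 0;
	if (poll_delay_ns > 0)		/* use this specific value */
		return (unsigned long long)poll_delay_ns;

	/* 0: sleep for half of the mean completion time */
	return mean_completion_ns / 2;
}
```

With a hypothetical mean of 80000 ns, a setting of 0 gives a 40000 ns sleep,
25000 gives 25000 ns, and -1 gives no sleep at all.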

[snip]

>  static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
> +				     struct blk_mq_hw_ctx *hctx,
>  				     struct request *rq)
>  {
>  	struct hrtimer_sleeper hs;
> +	enum hrtimer_mode mode;
> +	unsigned int nsecs;
>  	ktime_t kt;
>  
> -	if (!q->poll_nsec || test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
> +	if (test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
> +		return false;
> +
> +	/*
> +	 * poll_nsec can be:
> +	 *
> +	 * -1:	don't ever hybrid sleep
> +	 *  0:	use half of prev avg
> +	 * >0:	use this specific value
> +	 */
> +	if (q->poll_nsec == -1)
> +		return false;
> +	else if (q->poll_nsec > 0)
> +		nsecs = q->poll_nsec;
> +	else
> +		nsecs = blk_mq_poll_nsecs(q, hctx, rq);
> +
> +	if (!nsecs)
>  		return false;
>  
>  	set_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
> @@ -2477,9 +2539,10 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
>  	 * This will be replaced with the stats tracking code, using
>  	 * 'avg_completion_time / 2' as the pre-sleep target.
>  	 */
> -	kt = ktime_set(0, q->poll_nsec);
> +	kt = ktime_set(0, nsecs);
>  
> -	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	mode = HRTIMER_MODE_REL;
> +	hrtimer_init_on_stack(&hs.timer, CLOCK_MONOTONIC, mode);
>  	hrtimer_set_expires(&hs.timer, kt);
>  
>  	hrtimer_init_sleeper(&hs, current);
> @@ -2487,10 +2550,11 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
>  		if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
>  			break;
>  		set_current_state(TASK_UNINTERRUPTIBLE);
> -		hrtimer_start_expires(&hs.timer, HRTIMER_MODE_REL);
> +		hrtimer_start_expires(&hs.timer, mode);
>  		if (hs.task)
>  			io_schedule();
>  		hrtimer_cancel(&hs.timer);
> +		mode = HRTIMER_MODE_ABS;
>  	} while (hs.task && !signal_pending(current));

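The switch to HRTIMER_MODE_ABS after the first pass fixes the sleep loop that
patch 2 introduced, so it should be folded into patch 2 rather than carried
here.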
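
To spell out why re-arming with HRTIMER_MODE_REL is wrong: as far as I
understand the hrtimer code, starting a timer in relative mode adds the
current time to the stored expiry, so after the first start the timer holds
an absolute time. The toy model below (userspace, with made-up toy_* names,
not the kernel API) shows the arithmetic under that assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for HRTIMER_MODE_ABS / HRTIMER_MODE_REL (hypothetical names). */
enum toy_mode { TOY_MODE_ABS, TOY_MODE_REL };

struct toy_timer {
	uint64_t expires_ns;	/* relative before the first start, absolute after */
};

/*
 * Arm the timer at time now_ns and return the absolute expiry.
 * In relative mode the stored value is treated as an offset from now;
 * in absolute mode it is used unchanged.
 */
static uint64_t toy_start_expires(struct toy_timer *t, uint64_t now_ns,
				  enum toy_mode mode)
{
	if (mode == TOY_MODE_REL)
		t->expires_ns += now_ns;
	return t->expires_ns;
}
```

Take a 50000 ns sleep armed at t = 1000000: the expiry becomes 1050000. If
the task wakes early at t = 1020000 and the timer is re-armed in relative
mode, the expiry moves to 1050000 + 1020000 = 2070000, far past the intended
deadline. Re-arming in absolute mode keeps it at 1050000.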

>  	__set_current_state(TASK_RUNNING);
> @@ -2510,7 +2574,7 @@ static bool __blk_mq_poll(struct blk_mq_hw_ctx *hctx, struct request *rq)
>  	 * the IO isn't complete, we'll get called again and will go
>  	 * straight to the busy poll loop.
>  	 */
> -	if (blk_mq_poll_hybrid_sleep(q, rq))
> +	if (blk_mq_poll_hybrid_sleep(q, hctx, rq))
>  		return true;
>  
>  	hctx->poll_considered++;

[snip]

-- 
Omar