Re: [PATCH 3/3] blk-mq: unify hctx delay_work and run_work

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 04/10/2017 10:21 AM, Bart Van Assche wrote:
> On Mon, 2017-04-10 at 09:54 -0600, Jens Axboe wrote:
>> @@ -1281,27 +1280,39 @@ static void blk_mq_run_work_fn(struct work_struct *work)
>>  	struct blk_mq_hw_ctx *hctx;
>>  
>>  	hctx = container_of(work, struct blk_mq_hw_ctx, run_work.work);
>> -	__blk_mq_run_hw_queue(hctx);
>> -}
>>  
>> -static void blk_mq_delay_work_fn(struct work_struct *work)
>> -{
>> -	struct blk_mq_hw_ctx *hctx;
>> +	/*
>> +	 * If we are stopped, don't run the queue. The exception is if
>> +	 * BLK_MQ_S_START_ON_RUN is set. For that case, we auto-clear
>> +	 * the STOPPED bit and run it.
>> +	 */
>> +	if (test_bit(BLK_MQ_S_STOPPED, &hctx->state)) {
>> +		if (!test_bit(BLK_MQ_S_START_ON_RUN, &hctx->state))
>> +			return;
>>  
>> -	hctx = container_of(work, struct blk_mq_hw_ctx, delay_work.work);
>> +		clear_bit(BLK_MQ_S_START_ON_RUN, &hctx->state);
>> +		clear_bit(BLK_MQ_S_STOPPED, &hctx->state);
>> +	}
>>  
>> -	if (test_and_clear_bit(BLK_MQ_S_STOPPED, &hctx->state))
>> -		__blk_mq_run_hw_queue(hctx);
>> +	__blk_mq_run_hw_queue(hctx);
>>  }
>>  
>> +
>>  void blk_mq_delay_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs)
>>  {
>>  	if (unlikely(!blk_mq_hw_queue_mapped(hctx)))
>>  		return;
>>  
>> +	/*
>> +	 * Stop the hw queue, then modify currently delayed work.
>> +	 * This should prevent us from running the queue prematurely.
>> +	 * Mark the queue as auto-clearing STOPPED when it runs.
>> +	 */
>>  	blk_mq_stop_hw_queue(hctx);
>> -	kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
>> -			&hctx->delay_work, msecs_to_jiffies(msecs));
>> +	set_bit(BLK_MQ_S_START_ON_RUN, &hctx->state);
>> +	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
>> +					&hctx->run_work,
>> +					msecs_to_jiffies(msecs));
>>  }
>>  EXPORT_SYMBOL(blk_mq_delay_queue);
> 
> Hello Jens,
> 
> Is it possible for a block driver to call blk_mq_delay_queue() while
> blk_mq_delay_work_fn() is running? Can that result in BLK_MQ_S_STOPPED
> and BLK_MQ_S_START_ON_RUN being checked by blk_mq_delay_work_fn() after
> blk_mq_delay_queue() has set BLK_MQ_S_STOPPED and before it has set
> BLK_MQ_S_START_ON_RUN?

Yeah, I don't think it's bullet proof. I just looked at the in-kernel
users, and there's just one, nvme/fc.c. And it looks really buggy,
since it manually stops _all_ queues, then delays the one hw queue.
So we'll end up with potentially a bunch of stopped queues, and only
one getting restarted.

I think for blk_mq_delay_queue(), what we really care about is "this
queue has to run again sometime in the future". If that happens
much sooner than asked for, that's OK, the caller will just delay
again if it needs it. For most cases, we'll be close.

Obviously we can't have ordering issues and end up in a bad state,
we have to prevent that.

I'll fiddle with this a bit more.

-- 
Jens Axboe




[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux