Re: [PATCH 2/2] blk-mq: ensure a q_usage_counter reference is held when splitting bios

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 1/11/24 10:10 AM, Christoph Hellwig wrote:
> On Thu, Jan 11, 2024 at 09:17:41AM -0700, Jens Axboe wrote:
>> On 1/11/24 9:14 AM, Christoph Hellwig wrote:
>>> On Thu, Jan 11, 2024 at 09:12:23AM -0700, Jens Axboe wrote:
>>>> On 1/11/24 6:57 AM, Christoph Hellwig wrote:
>>>>> q_usage_counter is the only thing preventing us from the limits changing
>>>>> under us in __bio_split_to_limits, but blk_mq_submit_bio doesn't hold it.
>>>>>
>>>>> Change __submit_bio to always acquire the q_usage_counter counter before
>>>>> branching out into bio vs request based helper, and let blk_mq_submit_bio
>>>>> tell it if it consumed the reference by handing it off to the request.
>>>>
>>>> This causes hangs for me on shutdown/reset:
>>>
>>>> which seems to indicate that a reference is being leaked. Haven't poked
>>>> any further at it, I'll drop these two for now.
>>>
>>> Can you send me your .config?
>>
>> Don't think it's .config related, hit it on my test box and then in my
>> vm as well. It's likely due to batched allocations, would be my guess.
>> I can start/halt the vm fine with just a boot, but if I do:
> 
> Yupp, cached rqs it was.  The incremental patch below fixes it.
> Can you add some cached request testing to blktests so that this gets
> covered by default?
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 421db29535ba50..39eb4b99320c11 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2967,8 +2967,14 @@ bool blk_mq_submit_bio(struct bio *bio)
>  		if (rq && rq->q == q) {
>  			if (blk_mq_attempt_bio_merge(q, bio, nr_segs))
>  				return false;
> -			if (blk_mq_use_cached_rq(rq, plug, bio))
> +			if (blk_mq_use_cached_rq(rq, plug, bio)) {
> +				/*
> +				 * We're using the reference already owned by
> +				 * rq from here on.
> +				 */
> +				blk_queue_exit(q);
>  				goto has_rq;
> +			}
>  		}
>  	}

This also highlights a potential inefficiency in the patch, as now we're
grabbing+dropping references when we don't need to. May not be a big
deal, but it's one of the things that cached requests got rid of. Though
I'm not quite sure how to refactor to get rid of that, as we'd need to
shuffle the splitting and request get for that.

Could you take another look at the series with that in mind?

-- 
Jens Axboe





[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux