Re: Upcoming merge window

On 12/17/18 8:45 PM, Mike Snitzer wrote:
> On Mon, Dec 17 2018 at  7:26pm -0500,
> Jens Axboe <axboe@xxxxxxxxx> wrote:
> 
>> On 12/17/18 5:16 PM, Jens Axboe wrote:
>>> On 12/17/18 4:49 PM, Jens Axboe wrote:
>>>> On 12/17/18 4:27 PM, Jens Axboe wrote:
>>>>> On 12/17/18 4:16 PM, Bart Van Assche wrote:
>>>>>> On Mon, 2018-12-17 at 11:28 -0700, Jens Axboe wrote:
>>>>>>> As I'm sure you're all aware, the merge window is coming up. This time
>>>>>>> it happens to coincide with what is a holiday for most. My plan is to
>>>>>>> send in an EARLY pull request to Linus, Thursday at the latest. If you're
>>>>>>> sitting on anything that should go in with the initial merge, then I need
>>>>>>> to have it ASAP.
>>>>>>>
>>>>>>> I'll do a later pull about a week in with things that were missed, but
>>>>>>> I'm really hoping to make that fixes only. Any driver updates etc should
>>>>>>> go in now.
>>>>>>
>>>>>> Hi Jens,
>>>>>>
>>>>>> If I run blktests/srp/002 against Linus' master branch then that test passes,
>>>>>> no matter how many times I run that test. If I run that test against your
>>>>>> for-next branch however (commit 6a252f2772c0) then that test hangs. The output
>>>>>> of my list-pending-block-requests script is as follows when the hang occurs:
>>>>>
>>>>> Ugh, I'll try and run that here again, that test is unfortunately such a pain
>>>>> to run and requires me to manually install multipath libs (and remember to
>>>>> uninstall before rebooting, or udev fails?).
>>>>>
>>>>> I'll take a look!
>>>>
>>>> Looks like what Ming was talking about. CC'ing Ming and Mike. Lots of
>>>> kworkers are stuck like this:
>>>>
>>>> [  252.310187] kworker/2:19    D14072  8147      2 0x80000000
>>>> [  252.316803] Workqueue: dio/dm-2 dio_aio_complete_work
>>>> [  252.322925] Call Trace:
>>>> [  252.326137]  ? __schedule+0x231/0x5f0
>>>> [  252.330703]  schedule+0x2a/0x80
>>>> [  252.334689]  rwsem_down_write_failed+0x204/0x320
>>>> [  252.340330]  ? generic_make_request_checks+0x55/0x370
>>>> [  252.346542]  ? call_rwsem_down_write_failed+0x13/0x20
>>>> [  252.352669]  call_rwsem_down_write_failed+0x13/0x20
>>>> [  252.358601]  down_write+0x1b/0x30
>>>> [  252.362781]  __generic_file_fsync+0x3e/0xb0
>>>> [  252.367933]  ext4_sync_file+0xcc/0x2e0
>>>> [  252.372599]  dio_complete+0x1c4/0x210
>>>> [  252.377168]  process_one_work+0x1cb/0x350
>>>> [  252.382915]  worker_thread+0x28/0x3c0
>>>> [  252.387482]  ? process_one_work+0x350/0x350
>>>> [  252.392632]  kthread+0x107/0x120
>>>> [  252.396717]  ? kthread_park+0x80/0x80
>>>> [  252.401285]  ret_from_fork+0x1f/0x30
>>>>
>>>> Where did this regression come from? This was passing just fine
>>>> recently.
>>>
>>> Looks like this is the offending commit:
>>>
>>> commit c4576aed8d85d808cd6443bda58393d525207d01
>>> Author: Mike Snitzer <snitzer@xxxxxxxxxx>
>>> Date:   Tue Dec 11 09:10:26 2018 -0500
>>>
>>>     dm: fix request-based dm's use of dm_wait_for_completion
>>
>> Yep confirmed, reverted that on top and it passes. dm-2 has plenty of
>> requests that are allocated and pending dispatch, so the md_in_flight()
>> will return true. Mike, should it be checking for allocated requests or
>> in-flight?
> 
> I thought we could just check for allocated (as blk_mq_check_busy() does
> now) but clearly that is too broad a scope because I tested your
> suggestion and it allows the srp/002 test to pass:
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 6847f014606b..edbf4bb1b3e8 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -812,7 +812,7 @@ static bool blk_mq_check_busy(struct blk_mq_hw_ctx *hctx, struct request *rq,
>          * If we find a request, we know the queue is busy. Return false
>          * to stop the iteration.
>          */
> -       if (rq->q == hctx->queue) {
> +       if (rq->state == MQ_RQ_IN_FLIGHT && rq->q == hctx->queue) {
>                 bool *busy = priv;
> 
>                 *busy = true;
> 
> blk_mq_check_busy() was introduced for DM to use as a replacement for
> the inflight accounting it was doing itself:
>   ae879912 blk-mq: provide a helper to check if a queue is busy
> 
> So nothing else is currently calling it, but if you'd prefer to rename
> the functions to reflect the narrower MQ_RQ_IN_FLIGHT check that is fine
> by me (e.g. blk_mq_check_inflight and blk_mq_queue_has_inflight).

I agree, let's do the fix and rename it to inflight instead, since that
now reflects what it does.
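
To illustrate the distinction between "allocated" and "in flight", here is a
minimal userspace sketch in C. It is not kernel code: struct queue,
struct request, enum rq_state, and both helper functions are hypothetical
stand-ins, and only the shape of the condition mirrors the diff above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's request states (assumption). */
enum rq_state { RQ_IDLE, RQ_IN_FLIGHT, RQ_COMPLETE };

struct queue {
	int id;
};

struct request {
	struct queue *q;      /* queue this request was allocated for */
	enum rq_state state;  /* RQ_IDLE until the request is started */
};

/*
 * Broad check: any request allocated against the queue counts as busy,
 * even if it has not been dispatched yet.
 */
static bool queue_has_allocated(const struct request *rqs, size_t n,
				const struct queue *q)
{
	for (size_t i = 0; i < n; i++)
		if (rqs[i].q == q)
			return true;
	return false;
}

/*
 * Narrow check: only requests that have actually been started count,
 * analogous to adding the MQ_RQ_IN_FLIGHT test in the diff.
 */
static bool queue_has_inflight(const struct request *rqs, size_t n,
			       const struct queue *q)
{
	for (size_t i = 0; i < n; i++)
		if (rqs[i].state == RQ_IN_FLIGHT && rqs[i].q == q)
			return true;
	return false;
}
```

With two requests allocated on a queue but both still idle,
queue_has_allocated() returns true while queue_has_inflight() returns false.
That is the situation described for dm-2 above, where a waiter using the
broad check never sees the queue go idle.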

-- 
Jens Axboe



