Re: [PATCH] blk-iolatency: fix IO hang due to negative inflight counter

On 1/18/19 9:21 AM, Josef Bacik wrote:
> On Fri, Jan 18, 2019 at 05:58:18AM -0700, Jens Axboe wrote:
>> On 1/14/19 12:21 PM, Liu Bo wrote:
>>> Our test reported the following stack, and the vmcore showed that
>>> the ->inflight counter was -1.
>>>
>>> [ffffc9003fcc38d0] __schedule at ffffffff8173d95d
>>> [ffffc9003fcc3958] schedule at ffffffff8173de26
>>> [ffffc9003fcc3970] io_schedule at ffffffff810bb6b6
>>> [ffffc9003fcc3988] blkcg_iolatency_throttle at ffffffff813911cb
>>> [ffffc9003fcc3a20] rq_qos_throttle at ffffffff813847f3
>>> [ffffc9003fcc3a48] blk_mq_make_request at ffffffff8137468a
>>> [ffffc9003fcc3b08] generic_make_request at ffffffff81368b49
>>> [ffffc9003fcc3b68] submit_bio at ffffffff81368d7d
>>> [ffffc9003fcc3bb8] ext4_io_submit at ffffffffa031be00 [ext4]
>>> [ffffc9003fcc3c00] ext4_writepages at ffffffffa03163de [ext4]
>>> [ffffc9003fcc3d68] do_writepages at ffffffff811c49ae
>>> [ffffc9003fcc3d78] __filemap_fdatawrite_range at ffffffff811b6188
>>> [ffffc9003fcc3e30] filemap_write_and_wait_range at ffffffff811b6301
>>> [ffffc9003fcc3e60] ext4_sync_file at ffffffffa030cee8 [ext4]
>>> [ffffc9003fcc3ea8] vfs_fsync_range at ffffffff8128594b
>>> [ffffc9003fcc3ee8] do_fsync at ffffffff81285abd
>>> [ffffc9003fcc3f18] sys_fsync at ffffffff81285d50
>>> [ffffc9003fcc3f28] do_syscall_64 at ffffffff81003c04
>>> [ffffc9003fcc3f50] entry_SYSCALL_64_after_swapgs at ffffffff81742b8e
>>>
>>> The ->inflight counter can go negative (-1) if:
>>>
>>> 0) blk-throttle was enabled when the IO was issued, so its bio
>>> had an associated blkg,
>>>
>>> 1) blk-iolatency was disabled when the IO was issued, so the
>>> iolatency_grp in this blkg was not available at that point,
>>>
>>> 2) blk-iolatency was enabled before this IO reached its endio, so
>>> the iolatency_grp had become available by the time the IO hit its
>>> endio, and
>>>
>>> 3) the endio path decremented the ->inflight counter, which was never
>>> incremented at submit time, from 0 to -1.
>>>
>>> This patch uses atomic_dec_if_positive() instead to avoid the
>>> negative inflight counter.
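
To make that concrete, here is a minimal sketch of the guarded endio-side
decrement being proposed. This is a sketch, not the posted patch: struct
rq_wait and its ->inflight/->wait fields come from block/blk-rq-qos.h, but
the function name is made up for illustration.

    /* Sketch only, not the posted patch: guard the endio decrement. */
    static void iolat_cleanup_sketch(struct rq_wait *rqw)
    {
        /*
         * atomic_dec_if_positive() returns the decremented value, or a
         * negative number *without* touching the counter when it is
         * already zero.  An endio for a bio that was submitted while
         * blk-iolatency was disabled can therefore no longer drive
         * ->inflight to -1.
         */
        if (atomic_dec_if_positive(&rqw->inflight) >= 0)
            wake_up_all(&rqw->wait);
    }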
>>
>> The problem with that approach is that it'll hide a lot of other issues,
>> too.  Is there any way we can either track whether this rqw is in flight,
>> and only decrement if it is, or quiesce when enabling?
>>
> 
> I worried about this too, but really the only side-effect of mis-counting
> is that we let more IO through.  I think maybe we add a debug option that
> we can turn on to see if we're messing up the accounting, but in general I
> don't see a problem with this approach.
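
Something along these lines could serve as that debug check; note that
CONFIG_BLK_DEBUG_IOLAT is a hypothetical Kconfig symbol used only for
illustration, not an existing option.

    /* Hypothetical debug knob, not from any posted patch. */
    if (IS_ENABLED(CONFIG_BLK_DEBUG_IOLAT))
        /* scream if a decrement ever finds the counter already at zero */
        WARN_ON_ONCE(atomic_dec_if_positive(&rqw->inflight) < 0);
    else
        atomic_dec_if_positive(&rqw->inflight);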

The problem is that an accounting bug elsewhere (like a missing increment)
will now go unnoticed, which could completely screw up the counter.  An
occasional blip like the one described is totally fine, but there's no
guarantee that's all this would be papering over.

> The problem we're running into here is that there's not really a good way
> to tag a bio as "seen by io.latency."  We just have to assume that if
> we're enabled and there's a bi_blkg associated, we saw the bio at submit
> time.  We can't add a flag for every IO controller that starts tracking
> in-flight IOs, so for now I think this is a reasonable solution.  Thanks,

Can we quiesce instead when enabling/disabling?

-- 
Jens Axboe
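
For context on the quiesce idea, a rough sketch is below.
blk_mq_freeze_queue() and blk_mq_unfreeze_queue() are existing block-layer
helpers that drain in-flight requests and block new submissions; the
surrounding toggle function is hypothetical.

    /* Rough sketch of quiescing around an enable/disable flip. */
    static void iolat_toggle_sketch(struct request_queue *q, bool enable)
    {
        blk_mq_freeze_queue(q);   /* drain all in-flight IO */
        /*
         * With nothing in flight, flipping the per-blkg iolatency state
         * according to 'enable' cannot leave an unaccounted bio behind.
         */
        blk_mq_unfreeze_queue(q); /* resume normal submission */
    }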



