Re: [PATCH v6 0/5] Fix potential kernel panic when increase hardware queue

On Wed, May 13, 2020 at 7:08 AM Bart Van Assche <bvanassche@xxxxxxx> wrote:
>
> On 2020-05-12 05:20, Weiping Zhang wrote:
> > On Tue, May 12, 2020 at 8:09 PM Weiping Zhang <zwp10758@xxxxxxxxx> wrote:
> >> I didn't test block/030, since I don't pull blktests very often.
>
> That's unfortunate ...
>
> >> It's a different problem: the mapping can't be reset when falling
> >> back, so cpu[>=1] will still point to a hctx(!=0).
> >>
> >> It should be fixed by:
> >>
> >> diff --git a/block/blk-mq.c b/block/blk-mq.c
> >> index bc34d6b572b6..d82cefb0474f 100644
> >> --- a/block/blk-mq.c
> >> +++ b/block/blk-mq.c
> >> @@ -3365,8 +3365,8 @@ static void __blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set,
> >>                 goto reregister;
> >>
> >>         set->nr_hw_queues = nr_hw_queues;
> >> -       blk_mq_update_queue_map(set);
> >>  fallback:
> >> +       blk_mq_update_queue_map(set);
> >>         list_for_each_entry(q, &set->tag_list, tag_set_list) {
> >>                 blk_mq_realloc_hw_ctxs(set, q);
> >>                 if (q->nr_hw_queues != set->nr_hw_queues) {
>
> If this is posted as a patch, feel free to add:
>
> Tested-by: Bart van Assche <bvanassche@xxxxxxx>
>
I'll post it later, thank you.
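
To illustrate why the call has to sit below the fallback label, here is a
small user-space sketch. It is only a toy model, not the kernel code:
struct toy_set, update_queue_map(), realloc_hw_ctxs() and
update_nr_hw_queues() are hypothetical stand-ins, the round-robin mapping is
an assumption, and the allocation is hard-wired to fail for more than one
queue so that the fallback path is always taken.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a tag set: queue count plus a CPU-to-queue map. */
struct toy_set {
	unsigned int nr_hw_queues;
	unsigned int cpu_to_hctx[NR_CPUS];
};

/* Toy mapping (assumption): spread CPUs round-robin over the current queues. */
static void update_queue_map(struct toy_set *set)
{
	assert(set->nr_hw_queues > 0);
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		set->cpu_to_hctx[cpu] = cpu % set->nr_hw_queues;
}

/* Pretend allocation fails for more than one queue; returns queues obtained. */
static unsigned int realloc_hw_ctxs(unsigned int wanted)
{
	return wanted > 1 ? 1 : wanted;
}

/*
 * Model of the update path. With remap_after_fallback == false the map is
 * built once, before the label (old order). With true it is rebuilt on every
 * pass through the label (order proposed in the diff above).
 * Returns true if every CPU ends up pointing at an existing queue.
 */
bool update_nr_hw_queues(struct toy_set *set, unsigned int nr,
			 bool remap_after_fallback)
{
	unsigned int prev = set->nr_hw_queues;

	set->nr_hw_queues = nr;
	if (!remap_after_fallback)
		update_queue_map(set);
fallback:
	if (remap_after_fallback)
		update_queue_map(set);
	if (realloc_hw_ctxs(set->nr_hw_queues) != set->nr_hw_queues) {
		/* Allocation came up short: return to the previous count. */
		set->nr_hw_queues = prev;
		goto fallback;
	}

	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (set->cpu_to_hctx[cpu] >= set->nr_hw_queues)
			return false;	/* stale entry: hctx does not exist */
	return true;
}
```

Starting from a set with one queue and asking for four, the old order leaves
cpu 1..3 mapped to queues 1..3 after the fallback to one queue, so the
function returns false. With the remap below the label, all CPUs map to
queue 0 again and it returns true.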

> > And block/030 should also be improved?
> >
> >         # Since older null_blk versions do not allow "submit_queues" to be
> >         # modified, check first whether that configs attribute is writeable.
> >         # Each iteration of the loop below triggers $(nproc) + 1
> >         # null_init_hctx() calls. Since <interval>=$(nproc), all possible
> >         # blk_mq_realloc_hw_ctxs() error paths will be triggered. Whether or
> >         # not this test succeeds depends on whether or not _check_dmesg()
> >         # detects a kernel warning.
> >         if { echo "$(<"$sq")" >$sq; } 2>/dev/null; then
> >                 for ((i = 0; i < 100; i++)); do
> >                         echo 1 > $sq
> >                         nproc > $sq  # this line outputs lots of "nproc: write error: Cannot allocate memory"
> >                 done
> >         else
> >                 SKIP_REASON="Skipping test because $sq cannot be modified"
> >         fi
> >
> >
> > The test result shows this test case as [failed], but it actually
> > passes: no warning is detected in the kernel log when the above
> > patch is applied.
> >
> > block/030 (trigger the blk_mq_realloc_hw_ctxs() error path)  [failed]
> >     runtime  1.999s  ...  2.115s
> >     --- tests/block/030.out 2020-05-12 10:42:26.345782849 +0800
> >     +++ /data1/zwp/src/blktests/results/nodev/block/030.out.bad 2020-05-12 20:14:59.878915218 +0800
> >     @@ -1 +1,51 @@
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     +nproc: write error: Cannot allocate memory
> >     ...
> >     (Run 'diff -u tests/block/030.out /data1/zwp/src/blktests/results/nodev/block/030.out.bad' to see the entire diff)
>
> That's weird. I have not yet encountered this. Test block/030 passes on
> my setup.
>
> Thanks,
>
> Bart.


