Re: [External] Re: raid5 deadlock issue

On Thu, Nov 10, 2022 at 10:25 AM Zhang Tianci
<zhangtianci.1997@xxxxxxxxxxxxx> wrote:
>

> > fio -filename=testfile -ioengine=libaio -bs=16M -size=10G -numjobs=100
> > -iodepth=1 -runtime=60
> > -rw=write -group_reporting -name="test"
> >
> > Then I found the first deadlock state, but it is not the real reason.
> >
> > I will run a test with the latest kernel and report the result to you later.
> >
> I can reproduce the first deadlock in linux-6.1-rc4.
> There are 26 stripe_heads and 26 fio threads blocked with the same backtrace:
>
>  #0 [ffffc9000cd0f8b0] __schedule at ffffffff818b3c3c
>  #1 [ffffc9000cd0f940] schedule at ffffffff818b4313
>  #2 [ffffc9000cd0f950] md_bitmap_startwrite at ffffffffc063354a [md_mod]
>  #3 [ffffc9000cd0f9c0] __add_stripe_bio at ffffffffc064fbd6 [raid456]
>  #4 [ffffc9000cd0fa00] raid5_make_request at ffffffffc065a84c [raid456]
>  #5 [ffffc9000cd0fb30] md_handle_request at ffffffffc0628496 [md_mod]
>  #6 [ffffc9000cd0fb98] __submit_bio at ffffffff813f308f
>  #7 [ffffc9000cd0fbb8] submit_bio_noacct_nocheck at ffffffff813f3501
>  #8 [ffffc9000cd0fc00] __block_write_full_page at ffffffff8134ca64
>  #9 [ffffc9000cd0fc60] __writepage at ffffffff8123f4a3
> #10 [ffffc9000cd0fc78] write_cache_pages at ffffffff8123fb57
> #11 [ffffc9000cd0fd70] generic_writepages at ffffffff8123feef
> #12 [ffffc9000cd0fdc0] do_writepages at ffffffff81241f12
> #13 [ffffc9000cd0fe28] filemap_fdatawrite_wbc at ffffffff8123306b
> #14 [ffffc9000cd0fe48] __filemap_fdatawrite_range at ffffffff81239154
> #15 [ffffc9000cd0fec0] file_write_and_wait_range at ffffffff812393e1
> #16 [ffffc9000cd0fef0] blkdev_fsync at ffffffff813ec223
> #17 [ffffc9000cd0ff08] do_fsync at ffffffff81342798
> #18 [ffffc9000cd0ff30] __x64_sys_fsync at ffffffff813427e0
> #19 [ffffc9000cd0ff38] do_syscall_64 at ffffffff818a6114
> #20 [ffffc9000cd0ff50] entry_SYSCALL_64_after_hwframe at ffffffff81a0009b

Thanks for this information.

I guess this is with COUNTER_MAX set to 4? And is it slightly different from
the issue you found?

I will try to look into this next week (taking some time off this week).

Thanks,
Song


