Re: Re: Report a fuse deadlock scenario issue


 



At 2022-02-17 23:39:17, "Miklos Szeredi" <miklos@xxxxxxxxxx> wrote:
>On Fri, 11 Feb 2022 at 14:55, 陈立新 <clx428@xxxxxxx> wrote:
>>
>> Hi Miklos:
>> I hit a deadlock scenario on fuse. Here are the 4 backtraces:
>> PID: 301852  TASK: ffff80db78226c80  CPU: 93  COMMAND: "Thread-854"
>> #0 [ffff000c1d88b9e0] __switch_to at ffff000080088738
>> #1 [ffff000c1d88ba00] __schedule at ffff000080a06f48
>> #2 [ffff000c1d88ba90] schedule at ffff000080a07620
>> #3 [ffff000c1d88baa0] fuse_wait_on_page_writeback at ffff000001047418 [fuse]
>> #4 [ffff000c1d88bb00] fuse_page_mkwrite at ffff000001047538 [fuse]
>> #5 [ffff000c1d88bb40] do_page_mkwrite at ffff0000802cb77c
>> #6 [ffff000c1d88bb90] do_fault at ffff0000802d1840
>> #7 [ffff000c1d88bbd0] __handle_mm_fault at ffff0000802d3574
>> #8 [ffff000c1d88bc90] handle_mm_fault at ffff0000802d37c0
>> #9 [ffff000c1d88bcc0] do_page_fault at ffff000080a0ef94
>> #10 [ffff000c1d88bdc0] do_translation_fault at ffff000080a0f32c
>> #11 [ffff000c1d88bdf0] do_mem_abort at ffff0000800812cc
>> #12 [ffff000c1d88bff0] el0_da at ffff000080083b20
>>
>> PID: 400127  TASK: ffff80d1a1c51f00  CPU: 91  COMMAND: "Thread-677"
>> #0 [ffff000beb5e3a00] __switch_to at ffff000080088738
>> #1 [ffff000beb5e3a20] __schedule at ffff000080a06f48
>> #2 [ffff000beb5e3ab0] schedule at ffff000080a07620
>> #3 [ffff000beb5e3ac0] fuse_wait_on_page_writeback at ffff000001047418 [fuse]
>> #4 [ffff000beb5e3b20] fuse_page_mkwrite at ffff000001047538 [fuse]
>> #5 [ffff000beb5e3b60] do_page_mkwrite at ffff0000802cb77c
>> #6 [ffff000beb5e3bb0] do_wp_page at ffff0000802d0264
>> #7 [ffff000beb5e3c00] __handle_mm_fault at ffff0000802d363c
>> #8 [ffff000beb5e3cc0] handle_mm_fault at ffff0000802d37c0
>> #9 [ffff000beb5e3cf0] do_page_fault at ffff000080a0ef94
>> #10 [ffff000beb5e3df0] do_mem_abort at ffff0000800812cc
>> #11 [ffff000beb5e3ff0] el0_da at ffff000080083b20
>>
>> PID: 178830  TASK: ffff80dc1704cd80  CPU: 64  COMMAND: "kworker/u259:11"
>> #0 [ffff0000aab6b6f0] __switch_to at ffff000080088738
>> #1 [ffff0000aab6b710] __schedule at ffff000080a06f48
>> #2 [ffff0000aab6b7a0] schedule at ffff000080a07620
>> #3 [ffff0000aab6b7b0] io_schedule at ffff00008012dbc4
>> #4 [ffff0000aab6b7d0] __lock_page at ffff0000802854e0
>> #5 [ffff0000aab6b870] write_cache_pages at ffff0000802987e8
>> #6 [ffff0000aab6b990] fuse_writepages at ffff00000104ab6c [fuse]
>> #7 [ffff0000aab6b9f0] do_writepages at ffff00008029b2e0
>> #8 [ffff0000aab6ba70] __writeback_single_inode at ffff00008037f8b4
>> #9 [ffff0000aab6bac0] writeback_sb_inodes at ffff000080380150
>> #10 [ffff0000aab6bbd0] __writeback_inodes_wb at ffff0000803804c0
>> #11 [ffff0000aab6bc20] wb_writeback at ffff000080380880
>> #12 [ffff0000aab6bcd0] wb_workfn at ffff000080381470
>> #13 [ffff0000aab6bdb0] process_one_work at ffff000080113428
>> #14 [ffff0000aab6be00] worker_thread at ffff0000801136c0
>> #15 [ffff0000aab6be70] kthread at ffff00008011ab60
>>
>> PID: 47324  TASK: ffff80db5a038000  CPU: 88  COMMAND: "Thread-2064"
>> #0 [ffff000c2114b820] __switch_to at ffff000080088738
>> #1 [ffff000c2114b840] __schedule at ffff000080a06f48
>> #2 [ffff000c2114b8d0] schedule at ffff000080a07620
>> #3 [ffff000c2114b8e0] io_schedule at ffff00008012dbc4
>> #4 [ffff000c2114b900] __lock_page at ffff0000802854e0
>> #5 [ffff000c2114b9a0] write_cache_pages at ffff0000802987e8
>> #6 [ffff000c2114bac0] fuse_writepages at ffff00000104ab6c [fuse]
>> #7 [ffff000c2114bb20] do_writepages at ffff00008029b2e0
>> #8 [ffff000c2114bba0] __filemap_fdatawrite_range at ffff0000802883f8
>> #9 [ffff000c2114bc60] file_write_and_wait_range at ffff0000802886f0
>> #10 [ffff000c2114bca0] fuse_fsync_common at ffff0000010491d8 [fuse]
>> #11 [ffff000c2114bd90] fuse_fsync at ffff00000104938c [fuse]
>> #12 [ffff000c2114bdc0] vfs_fsync_range at ffff000080385938
>> #13 [ffff000c2114bdf0] __arm64_sys_msync at ffff0000802dcf8c
>> #14 [ffff000c2114be60] el0_svc_common at ffff000080097cbc
>> #15 [ffff000c2114bea0] el0_svc_handler at ffff000080097df0
>> #16 [ffff000c2114bff0] el0_svc at ffff000080084144
>>
>> The 4 threads are writing to the same file and have deadlocked:
>>   Thread 301852 holds the page 5 lock and is waiting for page 5 writeback to complete;
>>   Thread 400127 holds the page 0 lock and is waiting for page 0 writeback to complete;
>>   Thread 47324 is waiting for the page 5 lock and has already set pages 0 - 4 to writeback;
>>   Thread 178830 is waiting for the page 0 lock and has already set pages 5 - 6 to writeback.
>
>This last is not possible, because write_cache_pages() will always
>return once it has reached the end of the range; only the next
>invocation will wrap around to the zero-index page.  See this at the
>end of write_cache_pages():
>
>    if (wbc->range_cyclic && !done)
>        done_index = 0;
The kernel version I am using is 4.19.36, which does not have commit
64081362e8ff4587b4554087f3cfc73d3e0a4cd7 ("mm/page-writeback.c: fix
range_cyclic writeback vs writepages deadlock").
I think backporting that patch would fix this deadlock; a rough sketch of the
pre-patch behaviour is below.
Thanks.
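
For reference, this is roughly the shape of the range_cyclic path in
write_cache_pages() before that commit; it is paraphrased from memory rather
than copied from the 4.19 tree, so treat it as a sketch of the control flow
only:

    /*
     * Pre-patch write_cache_pages(), range_cyclic case (sketch).
     * After hitting the end of the file, a single invocation wraps
     * back to index 0 via "goto retry" and keeps writing, so one call
     * can first tag pages 5-6 for writeback and then block in
     * lock_page() on page 0, which matches the ordering in the dump.
     */
    if (wbc->range_cyclic) {
        writeback_index = mapping->writeback_index; /* prev offset */
        index = writeback_index;
        cycled = (index == 0);
        end = -1;
    }
retry:
    /* ... tag dirty pages, then lock and write each page from index to end ... */

    if (!cycled && !done) {
        /*
         * range_cyclic: we hit the last page and there is more
         * work to be done, so wrap back to the start of the file.
         */
        cycled = 1;
        index = 0;
        end = writeback_index - 1;
        goto retry;
    }

The commit removes this in-call wrap-around, so one write_cache_pages() call
only walks forward from the stored index to the end of the file, matching the
behaviour Miklos described above.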
>
>Otherwise the index will be monotonically increasing throughout a single
>write_cache_pages() call.
>
>That doesn't mean that there's no deadlock; this is pretty complex,
>but there must be some other explanation.
>
>Thanks,
>Miklos



