Re: Report a fuse deadlock scenario issue

On Fri, 11 Feb 2022 at 14:55, 陈立新 <clx428@xxxxxxx> wrote:
>
> Hi Miklos:
> I hit a deadlock scenario on fuse. Here are the 4 backtraces:
> PID: 301852  TASK: ffff80db78226c80  CPU: 93  COMMAND: "Thread-854"
> #0 [ffff000c1d88b9e0] __switch_to at ffff000080088738
> #1 [ffff000c1d88ba00] __schedule at ffff000080a06f48
> #2 [ffff000c1d88ba90] schedule at ffff000080a07620
> #3 [ffff000c1d88baa0] fuse_wait_on_page_writeback at ffff000001047418 [fuse]
> #4 [ffff000c1d88bb00] fuse_page_mkwrite at ffff000001047538 [fuse]
> #5 [ffff000c1d88bb40] do_page_mkwrite at ffff0000802cb77c
> #6 [ffff000c1d88bb90] do_fault at ffff0000802d1840
> #7 [ffff000c1d88bbd0] __handle_mm_fault at ffff0000802d3574
> #8 [ffff000c1d88bc90] handle_mm_fault at ffff0000802d37c0
> #9 [ffff000c1d88bcc0] do_page_fault at ffff000080a0ef94
> #10 [ffff000c1d88bdc0] do_translation_fault at ffff000080a0f32c
> #11 [ffff000c1d88bdf0] do_mem_abort at ffff0000800812cc
> #12 [ffff000c1d88bff0] el0_da at ffff000080083b20
>
> PID: 400127  TASK: ffff80d1a1c51f00  CPU: 91  COMMAND: "Thread-677"
> #0 [ffff000beb5e3a00] __switch_to at ffff000080088738
> #1 [ffff000beb5e3a20] __schedule at ffff000080a06f48
> #2 [ffff000beb5e3ab0] schedule at ffff000080a07620
> #3 [ffff000beb5e3ac0] fuse_wait_on_page_writeback at ffff000001047418 [fuse]
> #4 [ffff000beb5e3b20] fuse_page_mkwrite at ffff000001047538 [fuse]
> #5 [ffff000beb5e3b60] do_page_mkwrite at ffff0000802cb77c
> #6 [ffff000beb5e3bb0] do_wp_page at ffff0000802d0264
> #7 [ffff000beb5e3c00] __handle_mm_fault at ffff0000802d363c
> #8 [ffff000beb5e3cc0] handle_mm_fault at ffff0000802d37c0
> #9 [ffff000beb5e3cf0] do_page_fault at ffff000080a0ef94
> #10 [ffff000beb5e3df0] do_mem_abort at ffff0000800812cc
> #11 [ffff000beb5e3ff0] el0_da at ffff000080083b20
>
> PID: 178830  TASK: ffff80dc1704cd80  CPU: 64  COMMAND: "kworker/u259:11"
> #0 [ffff0000aab6b6f0] __switch_to at ffff000080088738
> #1 [ffff0000aab6b710] __schedule at ffff000080a06f48
> #2 [ffff0000aab6b7a0] schedule at ffff000080a07620
> #3 [ffff0000aab6b7b0] io_schedule at ffff00008012dbc4
> #4 [ffff0000aab6b7d0] __lock_page at ffff0000802854e0
> #5 [ffff0000aab6b870] write_cache_pages at ffff0000802987e8
> #6 [ffff0000aab6b990] fuse_writepages at ffff00000104ab6c [fuse]
> #7 [ffff0000aab6b9f0] do_writepages at ffff00008029b2e0
> #8 [ffff0000aab6ba70] __writeback_single_inode at ffff00008037f8b4
> #9 [ffff0000aab6bac0] writeback_sb_inodes at ffff000080380150
> #10 [ffff0000aab6bbd0] __writeback_inodes_wb at ffff0000803804c0
> #11 [ffff0000aab6bc20] wb_writeback at ffff000080380880
> #12 [ffff0000aab6bcd0] wb_workfn at ffff000080381470
> #13 [ffff0000aab6bdb0] process_one_work at ffff000080113428
> #14 [ffff0000aab6be00] worker_thread at ffff0000801136c0
> #15 [ffff0000aab6be70] kthread at ffff00008011ab60
>
> PID: 47324  TASK: ffff80db5a038000  CPU: 88  COMMAND: "Thread-2064"
> #0 [ffff000c2114b820] __switch_to at ffff000080088738
> #1 [ffff000c2114b840] __schedule at ffff000080a06f48
> #2 [ffff000c2114b8d0] schedule at ffff000080a07620
> #3 [ffff000c2114b8e0] io_schedule at ffff00008012dbc4
> #4 [ffff000c2114b900] __lock_page at ffff0000802854e0
> #5 [ffff000c2114b9a0] write_cache_pages at ffff0000802987e8
> #6 [ffff000c2114bac0] fuse_writepages at ffff00000104ab6c [fuse]
> #7 [ffff000c2114bb20] do_writepages at ffff00008029b2e0
> #8 [ffff000c2114bba0] __filemap_fdatawrite_range at ffff0000802883f8
> #9 [ffff000c2114bc60] file_write_and_wait_range at ffff0000802886f0
> #10 [ffff000c2114bca0] fuse_fsync_common at ffff0000010491d8 [fuse]
> #11 [ffff000c2114bd90] fuse_fsync at ffff00000104938c [fuse]
> #12 [ffff000c2114bdc0] vfs_fsync_range at ffff000080385938
> #13 [ffff000c2114bdf0] __arm64_sys_msync at ffff0000802dcf8c
> #14 [ffff000c2114be60] el0_svc_common at ffff000080097cbc
> #15 [ffff000c2114bea0] el0_svc_handler at ffff000080097df0
> #16 [ffff000c2114bff0] el0_svc at ffff000080084144
>
> The 4 threads write to the same file and deadlock:
>   Thread 301852 holds the page 5 lock and is waiting for page 5 writeback to complete;
>   Thread 400127 holds the page 0 lock and is waiting for page 0 writeback to complete;
>   Thread 47324 is waiting for the page 5 lock and has already set pages 0-4 to writeback;
>   Thread 178830 is waiting for the page 0 lock and has already set pages 5-6 to writeback;

This last is not possible, because write_cache_pages() always returns
when it has reached the end of the range; only the next invocation
wraps around to page index zero.  See this at the end of
write_cache_pages():

    if (wbc->range_cyclic && !done)
        done_index = 0;

Otherwise the index is monotonically increasing throughout a single
write_cache_pages() call.

That doesn't mean there's no deadlock; this is pretty complex, but
there must be some other explanation.

Thanks,
Miklos
