Re: [PATCH] fuse: Fix deadlock on open(O_TRUNC)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Miklos,

On Mon, Aug 16, 2021 at 02:39:14PM +0200, Miklos Szeredi wrote:
> On Fri, Aug 13, 2021 at 05:31:55PM +0800, Xie Yongji wrote:
> > The invalidate_inode_pages2() might be called with FUSE_NOWRITE
> > set in fuse_finish_open(), which can lead to deadlock in
> > fuse_launder_page().
> > 
> > To fix it, this tries to delay calling invalidate_inode_pages2()
> > until FUSE_NOWRITE is removed.
> 
> Thanks for the report and the patch.  I think it doesn't make sense to delay the
> invalidate_inode_pages2() call since the inode has been truncated in this case,
> there's no data worth writing out.
> 
> This patch replaces the invalidate_inode_pages2() with a truncate_pagecache()
> call.  This makes sense regardless of FOPEN_KEEP_CACHE or fc->writeback cache,
> so do it unconditionally.
> 
> Can you please check out the following patch?

I just saw a bug report where the obvious suspicious commit is this one.
Here's the relevant bits from the kernel log:

[ 3078.008319] BUG: Bad page state in process telegram-deskto  pfn:667339
[ 3078.008323] page:ffffcf63d99cce40 refcount:1 mapcount:0 mapping:ffff94009080d838 index:0x180
[ 3078.008329] fuse_file_aops [fuse] name:"domecamctl"
[ 3078.008330] flags: 0x17ffffc0000034(uptodate|lru|active)
[ 3078.008332] raw: 0017ffffc0000034 dead000000000100 dead000000000122 ffff94009080d838
[ 3078.008333] raw: 0000000000000180 0000000000000000 00000001ffffffff ffff93fbc7d10000
[ 3078.008333] page dumped because: page still charged to cgroup
[ 3078.008334] page->mem_cgroup:ffff93fbc7d10000
[ 3078.008334] bad because of flags: 0x30(lru|active)
[ 3078.008335] Modules linked in: <...>
[ 3078.008388] Supported: Yes
[ 3078.008390] CPU: 1 PID: 3738 Comm: telegram-deskto Kdump: loaded Not tainted 5.3.18-59.37-default #1 SLE15-SP3
[ 3078.008390] Hardware name: System manufacturer System Product Name/P8H77-M PRO, BIOS 0412 02/08/2012
[ 3078.008391] Call Trace:
[ 3078.008397]  dump_stack+0x66/0x8b
[ 3078.008399]  bad_page+0xc9/0x130
[ 3078.008401]  free_pcppages_bulk+0x443/0x870
[ 3078.008403]  free_unref_page_list+0x102/0x180
[ 3078.008406]  release_pages+0x301/0x400
[ 3078.008408]  __pagevec_release+0x2b/0x30
[ 3078.008409]  invalidate_inode_pages2_range+0x4d5/0x550
[ 3078.008412]  ? finish_wait+0x2f/0x60
[ 3078.008416]  fuse_finish_open+0x76/0xf0 [fuse]
[ 3078.008419]  fuse_open_common+0x105/0x110 [fuse]
[ 3078.008421]  ? fuse_open_common+0x110/0x110 [fuse]
[ 3078.008424]  do_dentry_open+0x204/0x3a0
[ 3078.008426]  path_openat+0x2fc/0x1520
[ 3078.008429]  ? mem_cgroup_commit_charge+0x5f/0x490
[ 3078.008431]  do_filp_open+0x9b/0x110
[ 3078.008433]  ? _copy_from_user+0x37/0x60
[ 3078.008435]  ? kmem_cache_alloc+0x18a/0x270
[ 3078.008436]  ? do_sys_open+0x1bd/0x260
[ 3078.008437]  do_sys_open+0x1bd/0x260
[ 3078.008440]  do_syscall_64+0x5b/0x1e0
[ 3078.008442]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 3078.008443] RIP: 0033:0x7f841f18b542
[ 3078.008444] Code: 89 45 b0 eb 93 0f 1f 00 44 89 55 9c e8 67 f5 ff ff 44 8b 55 9c 44 89 e2 4c 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 44 89 c7 89 45 9c e8 bb f5 ff ff 8b 45 9c
[ 3078.008445] RSP: 002b:00007ffdcf69f0e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000101
[ 3078.008446] RAX: ffffffffffffffda RBX: 00007f83ec4950f0 RCX: 00007f841f18b542
[ 3078.008446] RDX: 0000000000080000 RSI: 00007f83ec4950f0 RDI: 00000000ffffff9c
[ 3078.008447] RBP: 00007ffdcf69f150 R08: 0000000000000000 R09: ffffffffffffe798
[ 3078.008448] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000080000
[ 3078.008448] R13: 00007f83ec4950f0 R14: 00007ffdcf69f170 R15: 00007f83f798dcd0
[ 3078.008449] Disabling lock debugging due to kernel taint

This happens on an openSUSE kernel that contains a backport of commit
76224355db75 ("fuse: truncate pagecache on atomic_o_trunc") [1] but
there's also another report here [2] (seems to be the same issue), for a
5.15 (ubuntu) kernel.

Any idea about what could be happening here?  Could fuse_launder_page() be
causing this after this commit?  (I initially wondered if
fc->atomic_o_trunc could change between fuse_open_common() and the
fuse_finish_open() but couldn't see how.)

[1] https://bugzilla.opensuse.org/show_bug.cgi?id=1193691
[2] https://bugzilla.kernel.org/show_bug.cgi?id=215309

Cheers,
--
Luís


> Thanks,
> Miklos
> 
> ---
>  fs/fuse/file.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -198,12 +198,11 @@ void fuse_finish_open(struct inode *inod
>  	struct fuse_file *ff = file->private_data;
>  	struct fuse_conn *fc = get_fuse_conn(inode);
>  
> -	if (!(ff->open_flags & FOPEN_KEEP_CACHE))
> -		invalidate_inode_pages2(inode->i_mapping);
>  	if (ff->open_flags & FOPEN_STREAM)
>  		stream_open(inode, file);
>  	else if (ff->open_flags & FOPEN_NONSEEKABLE)
>  		nonseekable_open(inode, file);
> +
>  	if (fc->atomic_o_trunc && (file->f_flags & O_TRUNC)) {
>  		struct fuse_inode *fi = get_fuse_inode(inode);
>  
> @@ -211,10 +210,14 @@ void fuse_finish_open(struct inode *inod
>  		fi->attr_version = atomic64_inc_return(&fc->attr_version);
>  		i_size_write(inode, 0);
>  		spin_unlock(&fi->lock);
> +		truncate_pagecache(inode, 0);
>  		fuse_invalidate_attr(inode);
>  		if (fc->writeback_cache)
>  			file_update_time(file);
> +	} else if (!(ff->open_flags & FOPEN_KEEP_CACHE)) {
> +		invalidate_inode_pages2(inode->i_mapping);
>  	}
> +
>  	if ((file->f_mode & FMODE_WRITE) && fc->writeback_cache)
>  		fuse_link_write_file(file);
>  }



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux