Sorry for the late reply, I had thought I had but it looks like it didn't go out. Hou Tao wrote on Fri, Jan 18, 2019: > Use inode->i_lock to protect i_size_write(), else i_size_read() in > generic_fillattr() may loop infinitely in read_seqcount_begin() when > multiple processes invoke v9fs_vfs_getattr() or v9fs_vfs_getattr_dotl() > simultaneously under 32-bit SMP environment, and a soft lockup will be > triggered as show below: I didn't notice before, but we have a couple of other places that call i_size_write() in the 9p code, namely v9fs_write_end() and v9fs_file_write_iter(). I do not see what's special about these that would require not taking the inode lock? write_end() has a comment that i_size cannot change under it because it has the i_mutex, but it's obviously not sufficient given the stat2inode code does not have it, so it needs to do the same dance as write_iter. > watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [stat:2217] > Modules linked in: > CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 > Hardware name: Generic DT based system > PC is at generic_fillattr+0x104/0x108 > LR is at 0xec497f00 > pc : [<802b8898>] lr : [<ec497f00>] psr: 200c0013 > sp : ec497e20 ip : ed608030 fp : ec497e3c > r10: 00000000 r9 : ec497f00 r8 : ed608030 > r7 : ec497ebc r6 : ec497f00 r5 : ee5c1550 r4 : ee005780 > r3 : 0000052d r2 : 00000000 r1 : ec497f00 r0 : ed608030 > Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none > Control: 10c5387d Table: ac48006a DAC: 00000051 > CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 > Hardware name: Generic DT based system > Backtrace: > [<8010d974>] (dump_backtrace) from [<8010dc88>] (show_stack+0x20/0x24) > [<8010dc68>] (show_stack) from [<80a1d194>] (dump_stack+0xb0/0xdc) > [<80a1d0e4>] (dump_stack) from [<80109f34>] (show_regs+0x1c/0x20) > [<80109f18>] (show_regs) from [<801d0a80>] (watchdog_timer_fn+0x280/0x2f8) > [<801d0800>] (watchdog_timer_fn) from [<80198658>] (__hrtimer_run_queues+0x18c/0x380) > [<801984cc>] (__hrtimer_run_queues) from [<80198e60>] (hrtimer_run_queues+0xb8/0xf0) > [<80198da8>] (hrtimer_run_queues) from [<801973e8>] (run_local_timers+0x28/0x64) > [<801973c0>] (run_local_timers) from [<80197460>] (update_process_times+0x3c/0x6c) > [<80197424>] (update_process_times) from [<801ab2b8>] (tick_nohz_handler+0xe0/0x1bc) > [<801ab1d8>] (tick_nohz_handler) from [<80843050>] (arch_timer_handler_virt+0x38/0x48) > [<80843018>] (arch_timer_handler_virt) from [<80180a64>] (handle_percpu_devid_irq+0x8c/0x240) > [<801809d8>] (handle_percpu_devid_irq) from [<8017ac20>] (generic_handle_irq+0x34/0x44) > [<8017abec>] (generic_handle_irq) from [<8017b344>] (__handle_domain_irq+0x6c/0xc4) > [<8017b2d8>] (__handle_domain_irq) from [<801022e0>] (gic_handle_irq+0x4c/0x88) > [<80102294>] (gic_handle_irq) from [<80101a30>] (__irq_svc+0x70/0x98) > [<802b8794>] (generic_fillattr) from [<8056b284>] (v9fs_vfs_getattr_dotl+0x74/0xa4) > [<8056b210>] (v9fs_vfs_getattr_dotl) from [<802b8904>] (vfs_getattr_nosec+0x68/0x7c) > [<802b889c>] (vfs_getattr_nosec) from [<802b895c>] (vfs_getattr+0x44/0x48) > [<802b8918>] (vfs_getattr) from [<802b8a74>] (vfs_statx+0x9c/0xec) > [<802b89d8>] (vfs_statx) from [<802b9428>] (sys_lstat64+0x48/0x78) > [<802b93e0>] (sys_lstat64) from [<80101000>] (ret_fast_syscall+0x0/0x28) > > Reported-by: Xing Gaopeng <xingaopeng@xxxxxxxxxx> > Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx> > --- > v2: > * move acquisition of i_lock into v9fs_stat2inode() > * rename v9fs_stat2inode() to v9fs_stat2inode_flags() to accomodate > v9fs_refresh_inode() which doesn't need to update i_size. As a nitpick I don't really like foo() vs foo_flags() as foo-that-takes-extra-flags. There are a few such examples in the kernel already but I think it does not really convery information; it's better to have the base function take flags and just use it, or if you want wrappers then just never expose the flags but make a static _v9fs_stat2inode take flags, v9fs_stat2inode behave as the old one and a new v9fs_stat2inode_keepisize for the update with cache. I'd personally go with the former are there only are four call sites. Anyway, no strong feeling there, do as you'd like for this one (including keeping the current way if you think it's better); the other comments require another round-trip anyway :) > --- > fs/9p/v9fs_vfs.h | 21 +++++++++++++++++++-- > fs/9p/vfs_inode.c | 28 ++++++++++++++++------------ > fs/9p/vfs_inode_dotl.c | 28 +++++++++++++++++----------- > 3 files changed, 52 insertions(+), 25 deletions(-) > > diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h > index 5a0db6dec8d1..51b3e441b895 100644 > --- a/fs/9p/v9fs_vfs.h > +++ b/fs/9p/v9fs_vfs.h > @@ -40,6 +40,9 @@ > */ > #define P9_LOCK_TIMEOUT (30*HZ) > > +/* flags for v9fs_stat2inode_flags() & v9fs_stat2inode_dotl_flags() */ > +#define V9FS_STAT2INODE_KEEP_ISIZE 1 > + > extern struct file_system_type v9fs_fs_type; > extern const struct address_space_operations v9fs_addr_operations; > extern const struct file_operations v9fs_file_operations; > @@ -61,8 +64,10 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses, > struct inode *inode, umode_t mode, dev_t); > void v9fs_evict_inode(struct inode *inode); > ino_t v9fs_qid2ino(struct p9_qid *qid); > -void v9fs_stat2inode(struct p9_wstat *, struct inode *, struct super_block *); > -void v9fs_stat2inode_dotl(struct p9_stat_dotl *, struct inode *); > +void v9fs_stat2inode_flags(struct p9_wstat *stat, struct inode *inode, > + struct super_block *sb, unsigned int flags); > +void v9fs_stat2inode_dotl_flags(struct p9_stat_dotl *stat, struct inode *inode, > + unsigned int flags); > int v9fs_dir_release(struct inode *inode, struct file *filp); > int v9fs_file_open(struct inode *inode, struct file *file); > void v9fs_inode2stat(struct inode *inode, struct p9_wstat *stat); > @@ -83,4 +88,16 @@ static inline void v9fs_invalidate_inode_attr(struct inode *inode) > } > > int v9fs_open_to_dotl_flags(int flags); > + > +static inline void v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, > + struct super_block *sb) > +{ > + v9fs_stat2inode_flags(stat, inode, sb, 0); > +} > + > +static inline void v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, > + struct inode *inode) > +{ > + v9fs_stat2inode_dotl_flags(stat, inode, 0); > +} > #endif > diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c > index 85ff859d3af5..ca7d93184400 100644 > --- a/fs/9p/vfs_inode.c > +++ b/fs/9p/vfs_inode.c > @@ -1166,16 +1166,17 @@ static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr) > } > > /** > - * v9fs_stat2inode - populate an inode structure with mistat info > + * v9fs_stat2inode_flags - populate an inode structure with mistat info > * @stat: Plan 9 metadata (mistat) structure > * @inode: inode to populate > * @sb: superblock of filesystem > + * @flags: control flags (e.g. V9FS_STAT2INODE_KEEP_ISIZE) > * > */ > > void > -v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, > - struct super_block *sb) > +v9fs_stat2inode_flags(struct p9_wstat *stat, struct inode *inode, > + struct super_block *sb, unsigned int flags) > { > umode_t mode; > char ext[32]; > @@ -1184,6 +1185,9 @@ v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, > struct v9fs_session_info *v9ses = sb->s_fs_info; > struct v9fs_inode *v9inode = V9FS_I(inode); > > + /* Protect the updates of i_size & cache_validity */ > + spin_lock(&inode->i_lock); > + You're still locking the whole function, which is what I complained about in the previous patch. First of all, in the KEEP_ISIZE path that lock isn't needed at all, and it does not protect anything other than the i_size_write. I do see that you wrote in the comment this is meant to protect cache_validity, but it does not solve the race I mentionned in the v1 patch either - the race timing is from the moment the 9p getattr request has been sent out, not just setting fields in here. (e.g. thread1 sends getattr request, thread2 setattrs and invalidate inode attr after request has been processed by server, thread1 receives getattr and clears flag while attributes are still obsolete) Let's fix this in a proper separate patch and not pretend it's resolved by this trivial locking when it isn't. For now, just locking around the i_size_write makes most sense. Thanks, -- Dominique