Hi, On 2019/1/23 10:28, Dominique Martinet wrote: > Sorry for the late reply, I had thought I had but it looks like it > didn't go out. > > Hou Tao wrote on Fri, Jan 18, 2019: >> Use inode->i_lock to protect i_size_write(), else i_size_read() in >> generic_fillattr() may loop infinitely in read_seqcount_begin() when >> multiple processes invoke v9fs_vfs_getattr() or v9fs_vfs_getattr_dotl() >> simultaneously under 32-bit SMP environment, and a soft lockup will be >> triggered as show below: > > I didn't notice before, but we have a couple of other places that call > i_size_write() in the 9p code, namely v9fs_write_end() and > v9fs_file_write_iter(). > I do not see what's special about these that would require not taking > the inode lock? > > write_end() has a comment that i_size cannot change under it because it > has the i_mutex, but it's obviously not sufficient given the stat2inode > code does not have it, so it needs to do the same dance as write_iter. > OK, will do that in v3 How about adding a helper as shown in the following lines ? static inline void v9fs_i_size_write(struct inode *inode, loff_t i_size) { spin_lock(&inode->i_lock); i_size_write(inode, i_size); spin_unlock(&inode->i_lock); } > >> watchdog: BUG: soft lockup - CPU#5 stuck for 22s! [stat:2217] >> Modules linked in: >> CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 >> Hardware name: Generic DT based system >> PC is at generic_fillattr+0x104/0x108 >> LR is at 0xec497f00 >> pc : [<802b8898>] lr : [<ec497f00>] psr: 200c0013 >> sp : ec497e20 ip : ed608030 fp : ec497e3c >> r10: 00000000 r9 : ec497f00 r8 : ed608030 >> r7 : ec497ebc r6 : ec497f00 r5 : ee5c1550 r4 : ee005780 >> r3 : 0000052d r2 : 00000000 r1 : ec497f00 r0 : ed608030 >> Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none >> Control: 10c5387d Table: ac48006a DAC: 00000051 >> CPU: 5 PID: 2217 Comm: stat Not tainted 5.0.0-rc1-00005-g7f702faf5a9e #4 >> Hardware name: Generic DT based system >> Backtrace: >> [<8010d974>] (dump_backtrace) from [<8010dc88>] (show_stack+0x20/0x24) >> [<8010dc68>] (show_stack) from [<80a1d194>] (dump_stack+0xb0/0xdc) >> [<80a1d0e4>] (dump_stack) from [<80109f34>] (show_regs+0x1c/0x20) >> [<80109f18>] (show_regs) from [<801d0a80>] (watchdog_timer_fn+0x280/0x2f8) >> [<801d0800>] (watchdog_timer_fn) from [<80198658>] (__hrtimer_run_queues+0x18c/0x380) >> [<801984cc>] (__hrtimer_run_queues) from [<80198e60>] (hrtimer_run_queues+0xb8/0xf0) >> [<80198da8>] (hrtimer_run_queues) from [<801973e8>] (run_local_timers+0x28/0x64) >> [<801973c0>] (run_local_timers) from [<80197460>] (update_process_times+0x3c/0x6c) >> [<80197424>] (update_process_times) from [<801ab2b8>] (tick_nohz_handler+0xe0/0x1bc) >> [<801ab1d8>] (tick_nohz_handler) from [<80843050>] (arch_timer_handler_virt+0x38/0x48) >> [<80843018>] (arch_timer_handler_virt) from [<80180a64>] (handle_percpu_devid_irq+0x8c/0x240) >> [<801809d8>] (handle_percpu_devid_irq) from [<8017ac20>] (generic_handle_irq+0x34/0x44) >> [<8017abec>] (generic_handle_irq) from [<8017b344>] (__handle_domain_irq+0x6c/0xc4) >> [<8017b2d8>] (__handle_domain_irq) from [<801022e0>] (gic_handle_irq+0x4c/0x88) >> [<80102294>] (gic_handle_irq) from [<80101a30>] (__irq_svc+0x70/0x98) >> [<802b8794>] (generic_fillattr) from [<8056b284>] (v9fs_vfs_getattr_dotl+0x74/0xa4) >> [<8056b210>] (v9fs_vfs_getattr_dotl) from [<802b8904>] (vfs_getattr_nosec+0x68/0x7c) >> [<802b889c>] (vfs_getattr_nosec) from [<802b895c>] (vfs_getattr+0x44/0x48) >> [<802b8918>] (vfs_getattr) from [<802b8a74>] (vfs_statx+0x9c/0xec) >> [<802b89d8>] (vfs_statx) from [<802b9428>] (sys_lstat64+0x48/0x78) >> [<802b93e0>] (sys_lstat64) from [<80101000>] (ret_fast_syscall+0x0/0x28) >> >> Reported-by: Xing Gaopeng <xingaopeng@xxxxxxxxxx> >> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx> >> --- >> v2: >> * move acquisition of i_lock into v9fs_stat2inode() >> * rename v9fs_stat2inode() to v9fs_stat2inode_flags() to accomodate >> v9fs_refresh_inode() which doesn't need to update i_size. > > As a nitpick I don't really like foo() vs foo_flags() as > foo-that-takes-extra-flags. > There are a few such examples in the kernel already but I think it does > not really convery information; it's better to have the base function > take flags and just use it, or if you want wrappers then just never > expose the flags but make a static _v9fs_stat2inode take flags, > v9fs_stat2inode behave as the old one and a new > v9fs_stat2inode_keepisize for the update with cache. > I'd personally go with the former are there only are four call sites. > I agree with you. I will add a new flags parameter to v9fs_stat2inode() and use it directly instead of creating inline wrappers around it. > Anyway, no strong feeling there, do as you'd like for this one > (including keeping the current way if you think it's better); the > other comments require another round-trip anyway :) > >> --- >> fs/9p/v9fs_vfs.h | 21 +++++++++++++++++++-- >> fs/9p/vfs_inode.c | 28 ++++++++++++++++------------ >> fs/9p/vfs_inode_dotl.c | 28 +++++++++++++++++----------- >> 3 files changed, 52 insertions(+), 25 deletions(-) >> >> diff --git a/fs/9p/v9fs_vfs.h b/fs/9p/v9fs_vfs.h >> index 5a0db6dec8d1..51b3e441b895 100644 >> --- a/fs/9p/v9fs_vfs.h >> +++ b/fs/9p/v9fs_vfs.h >> @@ -40,6 +40,9 @@ >> */ >> #define P9_LOCK_TIMEOUT (30*HZ) >> >> +/* flags for v9fs_stat2inode_flags() & v9fs_stat2inode_dotl_flags() */ >> +#define V9FS_STAT2INODE_KEEP_ISIZE 1 >> + >> extern struct file_system_type v9fs_fs_type; >> extern const struct address_space_operations v9fs_addr_operations; >> extern const struct file_operations v9fs_file_operations; >> @@ -61,8 +64,10 @@ int v9fs_init_inode(struct v9fs_session_info *v9ses, >> struct inode *inode, umode_t mode, dev_t); >> void v9fs_evict_inode(struct inode *inode); >> ino_t v9fs_qid2ino(struct p9_qid *qid); >> -void v9fs_stat2inode(struct p9_wstat *, struct inode *, struct super_block *); >> -void v9fs_stat2inode_dotl(struct p9_stat_dotl *, struct inode *); >> +void v9fs_stat2inode_flags(struct p9_wstat *stat, struct inode *inode, >> + struct super_block *sb, unsigned int flags); >> +void v9fs_stat2inode_dotl_flags(struct p9_stat_dotl *stat, struct inode *inode, >> + unsigned int flags); >> int v9fs_dir_release(struct inode *inode, struct file *filp); >> int v9fs_file_open(struct inode *inode, struct file *file); >> void v9fs_inode2stat(struct inode *inode, struct p9_wstat *stat); >> @@ -83,4 +88,16 @@ static inline void v9fs_invalidate_inode_attr(struct inode *inode) >> } >> >> int v9fs_open_to_dotl_flags(int flags); >> + >> +static inline void v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, >> + struct super_block *sb) >> +{ >> + v9fs_stat2inode_flags(stat, inode, sb, 0); >> +} >> + >> +static inline void v9fs_stat2inode_dotl(struct p9_stat_dotl *stat, >> + struct inode *inode) >> +{ >> + v9fs_stat2inode_dotl_flags(stat, inode, 0); >> +} >> #endif >> diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c >> index 85ff859d3af5..ca7d93184400 100644 >> --- a/fs/9p/vfs_inode.c >> +++ b/fs/9p/vfs_inode.c >> @@ -1166,16 +1166,17 @@ static int v9fs_vfs_setattr(struct dentry *dentry, struct iattr *iattr) >> } >> >> /** >> - * v9fs_stat2inode - populate an inode structure with mistat info >> + * v9fs_stat2inode_flags - populate an inode structure with mistat info >> * @stat: Plan 9 metadata (mistat) structure >> * @inode: inode to populate >> * @sb: superblock of filesystem >> + * @flags: control flags (e.g. V9FS_STAT2INODE_KEEP_ISIZE) >> * >> */ >> >> void >> -v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, >> - struct super_block *sb) >> +v9fs_stat2inode_flags(struct p9_wstat *stat, struct inode *inode, >> + struct super_block *sb, unsigned int flags) >> { >> umode_t mode; >> char ext[32]; >> @@ -1184,6 +1185,9 @@ v9fs_stat2inode(struct p9_wstat *stat, struct inode *inode, >> struct v9fs_session_info *v9ses = sb->s_fs_info; >> struct v9fs_inode *v9inode = V9FS_I(inode); >> >> + /* Protect the updates of i_size & cache_validity */ >> + spin_lock(&inode->i_lock); >> + > > You're still locking the whole function, which is what I complained > about in the previous patch. > First of all, in the KEEP_ISIZE path that lock isn't needed at all, > and it does not protect anything other than the i_size_write. > In my understanding v9fs_refresh_inode() uses i_lock to protect the update of cache_validity, so I lock the whole function to keep up with with the old behavior, but as you have commented below, the lock doesn't do any help, so I will only use i_lock to protect i_size_write(). > > I do see that you wrote in the comment this is meant to protect > cache_validity, but it does not solve the race I mentionned in the v1 > patch either - the race timing is from the moment the 9p getattr > request has been sent out, not just setting fields in here. > (e.g. thread1 sends getattr request, thread2 setattrs and invalidate > inode attr after request has been processed by server, thread1 receives > getattr and clears flag while attributes are still obsolete) >> Let's fix this in a proper separate patch and not pretend it's resolved > by this trivial locking when it isn't. I see. Thanks for you comments. > > For now, just locking around the i_size_write makes most sense. >> > Thanks, >