On Wed, Jan 24, 2024 at 03:52:16PM +0800, Baokun Li wrote: > On 2024/1/24 0:24, Christian Brauner wrote: > > On Tue, Jan 23, 2024 at 03:07:50PM +0800, Baokun Li wrote: > > > On 2024/1/23 8:12, kernel test robot wrote: > > > > tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.misc > > > > head: 297983dc9011461cba6278bfe03f4305c4a2caa0 > > > > commit: 4bbd51d0f0ad709c0f02c100439d6c9ba6811d4b [12/13] fs: make the i_size_read/write helpers be smp_load_acquire/store_release() > > > > config: i386-randconfig-015-20240123 (https://download.01.org/0day-ci/archive/20240123/202401230837.TXro0PHi-lkp@xxxxxxxxx/config) > > > > compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18) > > > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240123/202401230837.TXro0PHi-lkp@xxxxxxxxx/reproduce) > > > > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of > > > > the same patch/commit), kindly add following tags > > > > | Reported-by: kernel test robot <lkp@xxxxxxxxx> > > > > | Closes: https://lore.kernel.org/oe-kbuild-all/202401230837.TXro0PHi-lkp@xxxxxxxxx/ > > > > > > > > All errors (new ones prefixed by >>): > > > > > > > > In file included from fs/netfs/buffered_read.c:10: > > > > In file included from fs/netfs/internal.h:9: > > > > In file included from include/linux/seq_file.h:12: > > > > > > include/linux/fs.h:911:9: error: call to '__compiletime_assert_207' declared with 'error' attribute: Need native word sized stores/loads for atomicity. > > > > 911 | return smp_load_acquire(&inode->i_size); > > > > | ^ > > > > include/asm-generic/barrier.h:206:2: note: expanded from macro 'smp_load_acquire' > > > > 206 | compiletime_assert_atomic_type(*p); \ > > > > | ^ > > > > include/linux/compiler_types.h:438:2: note: expanded from macro 'compiletime_assert_atomic_type' > > > > 438 | compiletime_assert(__native_word(t), \ > > > > | ^ > > > > include/linux/compiler_types.h:435:2: note: expanded from macro 'compiletime_assert' > > > > 435 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) > > > > | ^ > > > > include/linux/compiler_types.h:423:2: note: expanded from macro '_compiletime_assert' > > > > 423 | __compiletime_assert(condition, msg, prefix, suffix) > > > > | ^ > > > > include/linux/compiler_types.h:416:4: note: expanded from macro '__compiletime_assert' > > > > 416 | prefix ## suffix(); \ > > > > | ^ > > > > <scratch space>:38:1: note: expanded from here > > > > 38 | __compiletime_assert_207 > > > > | ^ > > > > 1 error generated. > > > > > > > > > > > > vim +911 include/linux/fs.h > > > > > > > > 874 > > > > 875 void filemap_invalidate_lock_two(struct address_space *mapping1, > > > > 876 struct address_space *mapping2); > > > > 877 void filemap_invalidate_unlock_two(struct address_space *mapping1, > > > > 878 struct address_space *mapping2); > > > > 879 > > > > 880 > > > > 881 /* > > > > 882 * NOTE: in a 32bit arch with a preemptable kernel and > > > > 883 * an UP compile the i_size_read/write must be atomic > > > > 884 * with respect to the local cpu (unlike with preempt disabled), > > > > 885 * but they don't need to be atomic with respect to other cpus like in > > > > 886 * true SMP (so they need either to either locally disable irq around > > > > 887 * the read or for example on x86 they can be still implemented as a > > > > 888 * cmpxchg8b without the need of the lock prefix). For SMP compiles > > > > 889 * and 64bit archs it makes no difference if preempt is enabled or not. > > > > 890 */ > > > > 891 static inline loff_t i_size_read(const struct inode *inode) > > > > 892 { > > > > 893 #if BITS_PER_LONG==32 && defined(CONFIG_SMP) > > > > 894 loff_t i_size; > > > > 895 unsigned int seq; > > > > 896 > > > > 897 do { > > > > 898 seq = read_seqcount_begin(&inode->i_size_seqcount); > > > > 899 i_size = inode->i_size; > > > > 900 } while (read_seqcount_retry(&inode->i_size_seqcount, seq)); > > > > 901 return i_size; > > > > 902 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) > > > > 903 loff_t i_size; > > > > 904 > > > > 905 preempt_disable(); > > > > 906 i_size = inode->i_size; > > > > 907 preempt_enable(); > > > > 908 return i_size; > > > > 909 #else > > > > 910 /* Pairs with smp_store_release() in i_size_write() */ > > > > > 911 return smp_load_acquire(&inode->i_size); > > > > 912 #endif > > > > 913 } > > > > 914 > > > Sorry to cause this compilation error, I have not encountered this > > > before when testing compilation with gcc on x86 and arm64. > > > Is there a special config required for this, or is it just clang that > > > triggers this compile error? > > > > > > The compile error seems to be that some architectural implementations > > > of smp_load_acquire() do not support pointers to long long data > > > types. Can't think of a good way to avoid this problem, any ideas > > > from linus and christian? > > That code in i_size_{read,write}() is terrible to look at. > > Pragmatically, we can probably just do READ_ONCE() + smp_rmb() and > > smp_wmb() + WRITE_ONCE(). This guarantees the ordering and it works > > (compiles) on 32bit as well. I think it's still possible that on 32bit > > the READ_ONCE()/WRITE_ONCE() are compiled as two accesses. But I'm not > > sure that this matters because iiuc that problem would've already been > > there with the barrier in mm/filemap.c instead of here. > Thank you very much for your suggestion! > > READ_ONCE()/WRITE_ONCE() now allows 64-bit accesses on 32-bit > architectures. Linus suggests smp_load_acquire()/smp_store_release() > to do the same in non-SMP case. I think this is better, what do you think? Practically it doesn't matter what option we choose. If we want to remove the compile time assert from smp_load_acquire() and smp_store_release() we should do so from all architectures that define it. So that's arm, arm64, loongarch, parisc, powerpc, riscv, s390, sparc, x86. Otherwise I think this might be a bit messy. And we can do that but I'm not sure that should be an accompanying patch to this change? Linus, if you prefer to remove the compile time asserts from smp_load_acquire() and smp_store_release() do you just want to apply a patch to the tree directly? Otherwise maybe a patch separate from this series might be better? In the meantime or if we don't choose to remove it we should just be able to do? Thoughts? diff --git a/include/linux/fs.h b/include/linux/fs.h index ed5966a70495..e46fe5d0dfc0 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -890,8 +890,8 @@ void filemap_invalidate_unlock_two(struct address_space *mapping1, */ static inline loff_t i_size_read(const struct inode *inode) { -#if BITS_PER_LONG==32 && defined(CONFIG_SMP) loff_t i_size; +#if BITS_PER_LONG==32 && defined(CONFIG_SMP) unsigned int seq; do { @@ -900,14 +900,16 @@ static inline loff_t i_size_read(const struct inode *inode) } while (read_seqcount_retry(&inode->i_size_seqcount, seq)); return i_size; #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION) - loff_t i_size; preempt_disable(); i_size = inode->i_size; preempt_enable(); return i_size; #else - return inode->i_size; + + i_size = READ_ONCE(inode->i_size); + smp_rmb(); + return i_size; #endif } @@ -929,7 +931,8 @@ static inline void i_size_write(struct inode *inode, loff_t i_size) inode->i_size = i_size; preempt_enable(); #else - inode->i_size = i_size; + smp_wmb(); + WRITE_ONCE(inode->i_size, i_size); #endif }