On 2024/1/24 18:30, Christian Brauner wrote:
On Wed, Jan 24, 2024 at 03:52:16PM +0800, Baokun Li wrote:
On 2024/1/24 0:24, Christian Brauner wrote:
On Tue, Jan 23, 2024 at 03:07:50PM +0800, Baokun Li wrote:
On 2024/1/23 8:12, kernel test robot wrote:
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.misc
head: 297983dc9011461cba6278bfe03f4305c4a2caa0
commit: 4bbd51d0f0ad709c0f02c100439d6c9ba6811d4b [12/13] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()
config: i386-randconfig-015-20240123 (https://download.01.org/0day-ci/archive/20240123/202401230837.TXro0PHi-lkp@xxxxxxxxx/config)
compiler: ClangBuiltLinux clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240123/202401230837.TXro0PHi-lkp@xxxxxxxxx/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-kbuild-all/202401230837.TXro0PHi-lkp@xxxxxxxxx/
All errors (new ones prefixed by >>):
In file included from fs/netfs/buffered_read.c:10:
In file included from fs/netfs/internal.h:9:
In file included from include/linux/seq_file.h:12:
include/linux/fs.h:911:9: error: call to '__compiletime_assert_207' declared with 'error' attribute: Need native word sized stores/loads for atomicity.
911 | return smp_load_acquire(&inode->i_size);
| ^
include/asm-generic/barrier.h:206:2: note: expanded from macro 'smp_load_acquire'
206 | compiletime_assert_atomic_type(*p); \
| ^
include/linux/compiler_types.h:438:2: note: expanded from macro 'compiletime_assert_atomic_type'
438 | compiletime_assert(__native_word(t), \
| ^
include/linux/compiler_types.h:435:2: note: expanded from macro 'compiletime_assert'
435 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:423:2: note: expanded from macro '_compiletime_assert'
423 | __compiletime_assert(condition, msg, prefix, suffix)
| ^
include/linux/compiler_types.h:416:4: note: expanded from macro '__compiletime_assert'
416 | prefix ## suffix(); \
| ^
<scratch space>:38:1: note: expanded from here
38 | __compiletime_assert_207
| ^
1 error generated.
vim +911 include/linux/fs.h
874
875 void filemap_invalidate_lock_two(struct address_space *mapping1,
876 struct address_space *mapping2);
877 void filemap_invalidate_unlock_two(struct address_space *mapping1,
878 struct address_space *mapping2);
879
880
881 /*
882 * NOTE: in a 32bit arch with a preemptable kernel and
883 * an UP compile the i_size_read/write must be atomic
884 * with respect to the local cpu (unlike with preempt disabled),
885 * but they don't need to be atomic with respect to other cpus like in
886 * true SMP (so they need either to either locally disable irq around
887 * the read or for example on x86 they can be still implemented as a
888 * cmpxchg8b without the need of the lock prefix). For SMP compiles
889 * and 64bit archs it makes no difference if preempt is enabled or not.
890 */
891 static inline loff_t i_size_read(const struct inode *inode)
892 {
893 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
894 loff_t i_size;
895 unsigned int seq;
896
897 do {
898 seq = read_seqcount_begin(&inode->i_size_seqcount);
899 i_size = inode->i_size;
900 } while (read_seqcount_retry(&inode->i_size_seqcount, seq));
901 return i_size;
902 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
903 loff_t i_size;
904
905 preempt_disable();
906 i_size = inode->i_size;
907 preempt_enable();
908 return i_size;
909 #else
910 /* Pairs with smp_store_release() in i_size_write() */
> 911 return smp_load_acquire(&inode->i_size);
912 #endif
913 }
914
Sorry to cause this compilation error, I have not encountered this
before when testing compilation with gcc on x86 and arm64.
Is there a special config required for this, or is it just clang that
triggers this compile error?
The compile error seems to be that some architectural implementations
of smp_load_acquire() do not support pointers to long long data
types. Can't think of a good way to avoid this problem, any ideas
from linus and christian?
That code in i_size_{read,write}() is terrible to look at.
Pragmatically, we can probably just do READ_ONCE() + smp_rmb() and
smp_wmb() + WRITE_ONCE(). This guarantees the ordering and it works
(compiles) on 32bit as well. I think it's still possible that on 32bit
the READ_ONCE()/WRITE_ONCE() are compiled as two accesses. But I'm not
sure that this matters because iiuc that problem would've already been
there with the barrier in mm/filemap.c instead of here.
Thank you very much for your suggestion!
READ_ONCE()/WRITE_ONCE() now allows 64-bit accesses on 32-bit
architectures. Linus suggests smp_load_acquire()/smp_store_release()
to do the same in non-SMP case. I think this is better, what do you think?
Practically it doesn't matter what option we choose.
If we want to remove the compile time assert from smp_load_acquire() and
smp_store_release() we should do so from all architectures that define
it. So that's arm, arm64, loongarch, parisc, powerpc, riscv, s390,
sparc, x86.
Hello, Christian!
If CONFIG_SMP is not enabled in include/asm-generic/barrier.h,
then smp_load_acquire/smp_store_release is implemented as
READ_ONCE/READ_ONCE and barrier() and type checking.
READ_ONCE/READ_ONCE already checks the pointer type,
but then checks it more stringently outside, but here the
more stringent checking seems unnecessary, so it is removed
and only the type checking in READ_ONCE/READ_ONCE is kept
to avoid compilation errors.
When CONFIG_SMP is enabled on 32-bit architectures,
smp_load_acquire/smp_store_release is not called in i_size_read/write,
so there is no compilation problem. On 64-bit architectures, there
is no compilation problem because sizeof(long long) == sizeof(long),
regardless of whether CONFIG_SMP is enabled or not.
Also the arm64 implementation of __smp_load_acquire does not call
READ_ONCE but only uses compiletime_assert_atomic_type to check
the length of the data, so we can't remove compile-time assertions
from all architectures.
Otherwise I think this might be a bit messy. And we can do that but I'm
not sure that should be an accompanying patch to this change?
Linus, if you prefer to remove the compile time asserts from
smp_load_acquire() and smp_store_release() do you just want to apply a
patch to the tree directly? Otherwise maybe a patch separate from this
series might be better?
In the meantime or if we don't choose to remove it we should just be
able to do?
Thoughts?
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ed5966a70495..e46fe5d0dfc0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -890,8 +890,8 @@ void filemap_invalidate_unlock_two(struct address_space *mapping1,
*/
static inline loff_t i_size_read(const struct inode *inode)
{
-#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
loff_t i_size;
+#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
unsigned int seq;
do {
@@ -900,14 +900,16 @@ static inline loff_t i_size_read(const struct inode *inode)
} while (read_seqcount_retry(&inode->i_size_seqcount, seq));
return i_size;
#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
- loff_t i_size;
preempt_disable();
i_size = inode->i_size;
preempt_enable();
return i_size;
#else
- return inode->i_size;
+
+ i_size = READ_ONCE(inode->i_size);
+ smp_rmb();
+ return i_size;
#endif
}
@@ -929,7 +931,8 @@ static inline void i_size_write(struct inode *inode, loff_t i_size)
inode->i_size = i_size;
preempt_enable();
#else
- inode->i_size = i_size;
+ smp_wmb();
+ WRITE_ONCE(inode->i_size, i_size);
#endif
}
Yes, using smp_rmb()/smp_wmb() would also solve the problem, but
the initial purpose of this patch was to replace smp_rmb() in filemap_read()
with the clearer smp_load_acquire/smp_store_release, and that's where
the community is going with this evolution. Later on, buffer and page/folio
will also switch to acquire/release, which is why I think Linus' suggestion
is better.
No offense. 😅
--
With Best Regards,
Baokun Li
.