On 2020/8/19 at 3:55 PM, Anshuman Khandual wrote:
>
>
> On 08/19/2020 11:17 AM, Alex Shi wrote:
>> pageblock_flags is used as long, since every pageblock_flags is just 4
>> bits, 'long' size will include 8(32bit machine) or 16 pageblocks' flags,
>> that flag setting has to sync in cmpxchg with 7 or 15 other pageblock
>> flags. It would cause long waiting for sync.
>>
>> If we could change the pageblock_flags variable as char, we could use
>> char size cmpxchg, which just sync up with 2 pageblock flags. it could
>> relief much false sharing in cmpxchg.
>
> Do you have numbers demonstrating claimed performance improvement
> after this change ?
>

The performance data is shown in another email.

LKP reported a build failure on ARMv6 with this patchset, since ARMv6
has no byte-sized cmpxchg, so maybe let's fall back to the long-sized
cmpxchg on it.

From db3d97ba8cc5e206b440bd40a92ef6955ad86bc0 Mon Sep 17 00:00:00 2001
From: Alex Shi <alex.shi@xxxxxxxxxxxxxxxxx>
Date: Tue, 18 Aug 2020 15:51:18 +0800
Subject: [PATCH v2 3/3] armv6: fix armv6 build issue

ARMv6 cannot emulate a 1-byte cmpxchg, so we have to use the 4-byte
cmpxchg on it.

arm-linux-gnueabi-ld: mm/page_alloc.o: in function `set_pfnblock_flags_mask':
(.text+0x32b4): undefined reference to `__bad_cmpxchg'
arm-linux-gnueabi-ld: (.text+0x32e0): undefined reference to `__bad_cmpxchg'
arm-linux-gnueabi-ld: drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.o: in function `hw_atl_b0_get_mac_temp':
hw_atl_b0.c:(.text+0x30fc): undefined reference to `__bad_udelay'

Reported-by: kernel test robot <lkp@xxxxxxxxx>
Signed-off-by: Alex Shi <alex.shi@xxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
Cc: Russell King <linux@xxxxxxxxxxxxxxx>
Cc: linux-mm@xxxxxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
---
 mm/page_alloc.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7da09d66233b..c09146a8946c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -517,7 +517,11 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
 {
 	unsigned char *bitmap;
 	unsigned long bitidx, byte_bitidx;
+#ifdef CONFIG_CPU_V6
+	unsigned long old_byte, byte;
+#else
 	unsigned char old_byte, byte;
+#endif
 
 	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != BITS_PER_BYTE);
 	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits));
@@ -532,9 +536,18 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
 	mask <<= bitidx;
 	flags <<= bitidx;
 
+#ifdef CONFIG_CPU_V6
+	byte = (unsigned long)READ_ONCE(bitmap[byte_bitidx]);
+#else
 	byte = READ_ONCE(bitmap[byte_bitidx]);
+#endif
 	for (;;) {
+#ifdef CONFIG_CPU_V6
+		/* arm v6 has no cmpxchgb function, so still false sharing long word */
+		old_byte = cmpxchg((unsigned long*)&bitmap[byte_bitidx], byte, (byte & ~mask) | flags);
+#else
 		old_byte = cmpxchg(&bitmap[byte_bitidx], byte, (byte & ~mask) | flags);
+#endif
 		if (byte == old_byte)
 			break;
 		byte = old_byte;
-- 
1.8.3.1
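
For readers following the thread, the byte-wide update described in the
quoted text can be sketched in userspace C. This is only an illustration
under stated assumptions, not the kernel code: it uses C11 <stdatomic.h>
in place of the kernel's cmpxchg(), it assumes 4 flag bits per block with
two blocks packed per byte as the quoted description says, and the names
set_block_flags(), get_block_flags(), FLAG_BITS and FLAG_MASK are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_BITS 4u	/* assumed: 4 flag bits per block, as in the thread */
#define FLAG_MASK 0x0Fu	/* low FLAG_BITS bits set */

/*
 * Hypothetical helper: store the 4-bit 'flags' of block 'idx' in a packed
 * bitmap. The compare-and-swap covers one byte, so an update can only
 * conflict with the single neighbouring block packed into the same byte.
 */
static void set_block_flags(_Atomic uint8_t *bitmap, size_t idx, uint8_t flags)
{
	_Atomic uint8_t *slot = &bitmap[idx / 2];	/* two blocks per byte */
	unsigned int shift = (unsigned int)(idx % 2) * FLAG_BITS;
	uint8_t mask = (uint8_t)(FLAG_MASK << shift);
	uint8_t val = (uint8_t)((flags & FLAG_MASK) << shift);
	uint8_t old = atomic_load(slot);

	assert((flags & ~FLAG_MASK) == 0);	/* caller passes at most 4 bits */

	/* On failure, 'old' is reloaded with the current byte and we retry. */
	while (!atomic_compare_exchange_weak(slot, &old,
					     (uint8_t)((old & ~mask) | val)))
		;
}

/* Hypothetical helper: read back the 4-bit flags of block 'idx'. */
static uint8_t get_block_flags(_Atomic uint8_t *bitmap, size_t idx)
{
	unsigned int shift = (unsigned int)(idx % 2) * FLAG_BITS;

	return (uint8_t)((atomic_load(&bitmap[idx / 2]) >> shift) & FLAG_MASK);
}
```

With a long-sized compare-and-swap, the same retry loop would instead
compare a whole word, so a concurrent change to any of the other 7 or 15
blocks in that word would force a retry. That wider contention is what
the CONFIG_CPU_V6 fallback in the patch keeps on that CPU.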