[PATCH v2 0/3] fs: make the i_size_read/write helpers be smp_load_acquire/store_release()

V1->V2:
  Add patch 3 to fix an error when compiling code for 32-bit architectures
  without CONFIG_SMP enabled.

This patchset follows Linus's suggestion to make the i_size_read/write
helpers use smp_load_acquire/smp_store_release(). With that ordering in
the helpers, the extra smp_rmb() in filemap_read() is no longer needed,
so it is removed. The patchset also removes the extra type checking in
smp_load_acquire/smp_store_release for the !CONFIG_SMP case to avoid
compilation errors.
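
For readers less familiar with the pairing, here is a minimal user-space
sketch of the idea using C11 atomics. It is not the kernel code from this
series: struct demo_inode and the demo_* functions are hypothetical
names, and memory_order_release/memory_order_acquire are only the rough
C11 analogues of smp_store_release()/smp_load_acquire(). The writer fills
the data and then publishes the size with release semantics; a reader
that observes the new size with acquire semantics also observes the data
written before it, so no separate read barrier is needed afterwards.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct inode; not the kernel definition. */
struct demo_inode {
    char data[16];           /* stands in for page cache contents */
    _Atomic int64_t i_size;  /* stands in for inode->i_size */
};

/* Publish a new size; earlier stores to data are ordered before it. */
static void demo_i_size_write(struct demo_inode *inode, int64_t size)
{
    atomic_store_explicit(&inode->i_size, size, memory_order_release);
}

/* Read the size; later loads of data are ordered after this load. */
static int64_t demo_i_size_read(struct demo_inode *inode)
{
    return atomic_load_explicit(&inode->i_size, memory_order_acquire);
}

/*
 * Single-threaded walk-through of the writer order and the reader order.
 * Returns 1 if the bytes covered by the size that was read match what
 * the writer stored.
 */
static int demo_publish_then_read(void)
{
    struct demo_inode inode;

    memset(inode.data, 0, sizeof(inode.data));
    atomic_init(&inode.i_size, 0);

    /* Writer: data first, then the size that makes it visible. */
    memcpy(inode.data, "hello", 5);
    demo_i_size_write(&inode, 5);

    /* Reader: size first, then only the bytes the size covers. */
    int64_t size = demo_i_size_read(&inode);
    assert(size <= (int64_t)sizeof(inode.data));
    return size == 5 && memcmp(inode.data, "hello", (size_t)size) == 0;
}
```

The walk-through runs in one thread, so it only illustrates the order of
operations on each side; the ordering guarantee itself matters when the
writer and the reader run on different CPUs.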

Functional tests were performed and no new problems were found.

Here are the results of unixbench tests based on 6.7.0-next-20240118 on
arm64. The single-copy run shows a small regression (-1.43% in total)
and the 72-copy run shows a small improvement (+2.41% in total), so
overall the impact is not significant.

### 72 CPUs in system; running 1 parallel copy of tests
System Benchmarks Index Values        |   base  | patched |  cmp   |
--------------------------------------|---------|---------|--------|
Dhrystone 2 using register variables  | 3635.06 | 3596.3  | -1.07% |
Double-Precision Whetstone            | 808.58  | 808.58  | 0.00%  |
Execl Throughput                      | 623.52  | 618.1   | -0.87% |
File Copy 1024 bufsize 2000 maxblocks | 1715.82 | 1668.58 | -2.75% |
File Copy 256 bufsize 500 maxblocks   | 1320.98 | 1250.16 | -5.36% |
File Copy 4096 bufsize 8000 maxblocks | 2639.36 | 2488.48 | -5.72% |
Pipe Throughput                       | 869.06  | 872.3   | 0.37%  |
Pipe-based Context Switching          | 106.26  | 117.22  | 10.31% |
Process Creation                      | 247.72  | 246.74  | -0.40% |
Shell Scripts (1 concurrent)          | 1234.98 | 1226    | -0.73% |
Shell Scripts (8 concurrent)          | 6893.96 | 6210.46 | -9.91% |
System Call Overhead                  | 493.72  | 494.28  | 0.11%  |
--------------------------------------|---------|---------|--------|
Total                                 | 1003.92 | 989.58  | -1.43% |

### 72 CPUs in system; running 72 parallel copies of tests
System Benchmarks Index Values        |   base    |  patched  |  cmp   |
--------------------------------------|-----------|-----------|--------|
Dhrystone 2 using register variables  | 260471.88 | 258065.04 | -0.92% |
Double-Precision Whetstone            | 58212.32  | 58219.3   | 0.01%  |
Execl Throughput                      | 6954.7    | 7444.08   | 7.04%  |
File Copy 1024 bufsize 2000 maxblocks | 64244.74  | 64618.24  | 0.58%  |
File Copy 256 bufsize 500 maxblocks   | 89933.8   | 87026.38  | -3.23% |
File Copy 4096 bufsize 8000 maxblocks | 79808.14  | 81916.42  | 2.64%  |
Pipe Throughput                       | 62174.38  | 62389.74  | 0.35%  |
Pipe-based Context Switching          | 27239.28  | 27887.24  | 2.38%  |
Process Creation                      | 3551.28   | 3800.54   | 7.02%  |
Shell Scripts (1 concurrent)          | 19212.26  | 20749.34  | 8.00%  |
Shell Scripts (8 concurrent)          | 20842.02  | 21958.12  | 5.36%  |
System Call Overhead                  | 35328.24  | 35451.68  | 0.35%  |
--------------------------------------|-----------|-----------|--------|
Total                                 | 35592.42  | 36450.36  | 2.41%  |
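
The cmp column is not defined in the text, but its values in both tables
are consistent with the relative change of the patched index against the
base index, in percent. A small sketch of that calculation, assuming this
reading is right (cmp_percent is a hypothetical helper name):

```c
#include <assert.h>

/*
 * Relative change of the patched index against the base index, in
 * percent. Negative values mean the patched kernel scored lower.
 */
static double cmp_percent(double base, double patched)
{
    assert(base > 0.0);
    return (patched - base) / base * 100.0;
}
```

For example, cmp_percent(1003.92, 989.58) rounds to the -1.43% shown in
the Total row of the single-copy table.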

Baokun Li (3):
  fs: make the i_size_read/write helpers be
    smp_load_acquire/store_release()
  Revert "mm/filemap: avoid buffered read/write race to read
    inconsistent data"
  asm-generic: remove extra type checking in acquire/release for non-SMP
    case

 include/asm-generic/barrier.h |  2 --
 include/linux/fs.h            | 10 ++++++++--
 mm/filemap.c                  |  9 ---------
 3 files changed, 8 insertions(+), 13 deletions(-)

-- 
2.31.1




