Re: [PATCH v4 1/3] locking/rwsem: Remove arch specific rwsem files

On Wed, Feb 13, 2019 at 05:00:15PM -0500, Waiman Long wrote:
> As the generic rwsem-xadd code uses the appropriate acquire and
> release versions of the atomic operations, the arch-specific rwsem.h
> files will not be that much faster than the generic code as long as
> the atomic functions are properly implemented. So we can remove those
> arch-specific rwsem.h files and stop building asm/rwsem.h to reduce
> maintenance effort.
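
For readers less familiar with the fast path in question, here is a
minimal userspace sketch of the acquire/release pattern the generic
code relies on, written with C11 atomics. It is an illustrative toy
(struct toy_rwsem and the toy_* functions are made-up names, and the
count encoding is invented for the sketch), not the kernel's
implementation, which uses the atomic_long_*_acquire/_release helpers
instead:

    #include <stdatomic.h>
    #include <stdbool.h>

    /* Toy stand-in for struct rw_semaphore: readers increment the
     * count; a writer drives it negative (this encoding is made up
     * for the sketch and is not the kernel's). */
    struct toy_rwsem {
            atomic_long count;
    };

    /* Reader fast path: the acquire ordering pairs with the release
     * in the writer's unlock, so the reader sees the writer's updates
     * before entering its critical section. */
    static bool toy_down_read_fast(struct toy_rwsem *sem)
    {
            return atomic_fetch_add_explicit(&sem->count, 1,
                                             memory_order_acquire) >= 0;
    }

    /* Reader unlock: release ordering publishes the reader's critical
     * section to whichever writer takes the lock next. */
    static void toy_up_read(struct toy_rwsem *sem)
    {
            atomic_fetch_sub_explicit(&sem->count, 1,
                                      memory_order_release);
    }

On x86-64 the acquire fetch-add above compiles down to a single locked
read-modify-write instruction, the same kind of instruction the
hand-written asm fast paths used, so there is little left for the
arch-specific headers to gain.
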
> 
> Currently, only x86, alpha and ia64 have implemented architecture
> specific fast paths. I don't have access to alpha and ia64 systems for
> testing, but they are legacy systems that are not likely to be updated
> to the latest kernel anyway.
> 
> Measured with a rwsem microbenchmark, the total locking rates on a
> 4-socket 56-core 112-thread x86-64 system before and after the patch
> were as follows ("mixed" means an equal number of read and write
> locks):
> 
>                       Before Patch              After Patch
>    # of Threads  wlock   rlock   mixed     wlock   rlock   mixed
>    ------------  -----   -----   -----     -----   -----   -----
>         1        29,201  30,143  29,458    28,615  30,172  29,201
>         2         6,807  13,299   1,171     7,725  15,025   1,804
>         4         6,504  12,755   1,520     7,127  14,286   1,345
>         8         6,762  13,412     764     6,826  13,652     726
>        16         6,693  15,408     662     6,599  15,938     626
>        32         6,145  15,286     496     5,549  15,487     511
>        64         5,812  15,495      60     5,858  15,572      60
> 
> There was some run-to-run variation in the multi-thread tests. For
> x86-64, the generic C fast path appears to be slightly faster than
> the assembly version under low lock contention. Looking at the
> assembly version of the fast paths, there are asm-to/from-C wrappers
> that save and restore all the callee-clobbered registers (7 registers
> on x86-64). The assembly generated from the generic C code doesn't
> need to do that, which may explain the slight performance gain here.
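
To make the register-saving point concrete: after this change the
generic fast path is just a static inline around one atomic operation,
so the compiler inlines it into each caller and spills only the
registers that are actually live. Below is a simplified, self-contained
C11 sketch of that shape (toy types and a simplified void slow-path
prototype; not the verbatim kernel code, which lives in
kernel/locking/rwsem.h):

    #include <stdatomic.h>

    struct rw_semaphore { atomic_long count; };    /* toy stand-in */

    /* out-of-line slow path, like rwsem_down_read_failed() in
     * kernel/locking/rwsem-xadd.c (prototype simplified here) */
    extern void rwsem_down_read_failed(struct rw_semaphore *sem);

    static inline void __down_read(struct rw_semaphore *sem)
    {
            /* one inlined locked instruction on x86-64, with no
             * asm-to-C thunk unconditionally pushing and popping
             * seven registers around the slow-path call, as the
             * deleted wrappers did */
            if (atomic_fetch_add_explicit(&sem->count, 1,
                                          memory_order_acquire) < 0)
                    rwsem_down_read_failed(sem);
    }
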
> 
> The asm-generic rwsem.h can also be merged into kernel/locking/rwsem.h
> with no code change, as no code outside kernel/locking needs to access
> the internal rwsem macros and functions.
> 
> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> ---
>  MAINTAINERS                     |   1 -
>  arch/alpha/include/asm/rwsem.h  | 211 -----------------------------------
>  arch/arm/include/asm/Kbuild     |   1 -
>  arch/arm64/include/asm/Kbuild   |   1 -
>  arch/hexagon/include/asm/Kbuild |   1 -
>  arch/ia64/include/asm/rwsem.h   | 172 -----------------------------
>  arch/powerpc/include/asm/Kbuild |   1 -
>  arch/s390/include/asm/Kbuild    |   1 -
>  arch/sh/include/asm/Kbuild      |   1 -
>  arch/sparc/include/asm/Kbuild   |   1 -
>  arch/x86/include/asm/rwsem.h    | 237 ----------------------------------------
>  arch/x86/lib/Makefile           |   1 -
>  arch/x86/lib/rwsem.S            | 156 --------------------------
>  arch/x86/um/Makefile            |   1 -
>  arch/xtensa/include/asm/Kbuild  |   1 -
>  include/asm-generic/rwsem.h     | 140 ------------------------
>  include/linux/rwsem.h           |   4 +-
>  kernel/locking/percpu-rwsem.c   |   2 +
>  kernel/locking/rwsem.h          | 130 ++++++++++++++++++++++
>  19 files changed, 133 insertions(+), 930 deletions(-)
>  delete mode 100644 arch/alpha/include/asm/rwsem.h
>  delete mode 100644 arch/ia64/include/asm/rwsem.h
>  delete mode 100644 arch/x86/include/asm/rwsem.h
>  delete mode 100644 arch/x86/lib/rwsem.S
>  delete mode 100644 include/asm-generic/rwsem.h

Looks like a nice cleanup, thanks:

Acked-by: Will Deacon <will.deacon@xxxxxxx>

Will


