Re: [RFC PATCH v2 3/4] hp: Implement Hazard Pointers

On 2024-10-04 23:25, Joel Fernandes wrote:
> On Fri, Oct 4, 2024 at 2:29 PM Mathieu Desnoyers
> <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>>
>> This API provides existence guarantees of objects through Hazard
>> Pointers (HP). This minimalist implementation is specific to use
>> with preemption disabled, but can be extended further as needed.
>>
>> Each HP domain defines a fixed number of hazard pointer slots (nr_cpus)
>> across the entire system.
>>
>> Its main benefit over RCU is that it allows fast reclaim of
>> HP-protected pointers without needing to wait for a grace period.
>>
>> It also allows the hazard pointer scan to call a user-defined callback
>> to retire a hazard pointer slot immediately if needed. This callback
>> may, for instance, issue an IPI to the relevant CPU.
>>
>> There are a few possible use-cases for this in the Linux kernel:
>>
>>    - Improve performance of mm_count by replacing lazy active mm by HP.
>>    - Guarantee object existence on pointer dereference to use refcount:
>>      - replace locking used for that purpose in some drivers,
>>      - replace RCU + inc_not_zero pattern,
>>    - rtmutex: Improve situations where locks need to be taken in
>>      reverse dependency chain order by guaranteeing existence of
>>      first and second locks in traversal order, allowing them to be
>>      locked in the correct order (which is reverse from traversal
>>      order) rather than try-lock+retry on nested lock.
>>
>> References:
>>
>> [1]: M. M. Michael, "Hazard pointers: safe memory reclamation for
>>       lock-free objects," in IEEE Transactions on Parallel and
>>       Distributed Systems, vol. 15, no. 6, pp. 491-504, June 2004
>> [ ... ]
>> ---
>> Changes since v0:
>> - Remove slot variable from hp_dereference_allocate().
>> ---
>>   include/linux/hp.h | 158 +++++++++++++++++++++++++++++++++++++++++++++
>>   kernel/Makefile    |   2 +-
>>   kernel/hp.c        |  46 +++++++++++++
>
> Just a housekeeping comment, ISTR Linus looking down on adding bodies
> of C code to header files (like hp_dereference_allocate). I understand
> maybe the rationale is that the functions included are inlined. But do
> all of them have to be inlined? Such headers also hurt code browsing
> capabilities in code browsers like clangd. clangd doesn't understand
> header files because it can't independently compile them -- it uses
> the compiler to generate and extract the AST for superior code
> browsing/completion.
>
> Also have you looked at the benefits of inlining for hp.h?
> hp_dereference_allocate() seems large enough that inlining may not
> matter much, but I haven't compiled it and looked at the asm myself.

Here is a comparison in userspace:

* With "hp dereference allocate" inlined:

    test_hpref_benchmark (smp_mb)             nr_reads   1994298193 nr_writes     22293162 nr_ops   2016591355
    test_hpref_benchmark (barrier/membarrier) nr_reads  15208690879 nr_writes      1893785 nr_ops  15210584664

* With "hp dereference allocate" implemented as a function call:

    test_hpref_benchmark (smp_mb)             nr_reads   1558924716 nr_writes     14261028 nr_ops   1573185744
    test_hpref_benchmark (barrier/membarrier) nr_reads   5881131707 nr_writes      2005140 nr_ops   5883136847

So the overhead of the function call when using symmetric memory barriers
between hp allocate/hp scan is roughly a 22% slowdown in nr_ops.

It's worse in the asymmetric barrier/membarrier case, introducing a 61%
slowdown.
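
For reference, the slowdown percentages can be recomputed from the nr_ops
totals in the benchmark output. Below is a small C helper for that
arithmetic; the function names are hypothetical and are not part of the
patch or of the benchmark itself.

```c
#include <stdio.h>

/*
 * Relative slowdown of the out-of-line run versus the inlined run,
 * expressed as a percentage of the inlined throughput.
 * (Hypothetical helper, not part of the patch.)
 */
static double slowdown_pct(double inlined_ops, double call_ops)
{
	return 100.0 * (inlined_ops - call_ops) / inlined_ops;
}

/* Print the slowdown for both barrier schemes, using the nr_ops totals. */
static void print_slowdown_report(void)
{
	printf("smp_mb:             %.1f%% slowdown\n",
	       slowdown_pct(2016591355.0, 1573185744.0));
	printf("barrier/membarrier: %.1f%% slowdown\n",
	       slowdown_pct(15210584664.0, 5883136847.0));
}
```

Calling print_slowdown_report() from main() prints one line per barrier
scheme.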

Given that the overhead is noticeable, I am tempted to leave the hazard
pointer allocate/retire as inline functions.
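
To give an idea of what sits on the inlined fast path, here is a minimal
userspace sketch of a hazard pointer acquire/release pair and the matching
scan check, written with C11 atomics and the symmetric (full barrier)
scheme. This is not the code from the patch: the hp_sketch_* names, the
NR_SLOTS constant (standing in for nr_cpus) and the slot layout are all
assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical stand-in for nr_cpus */

struct hp_slot {
	_Atomic(void *) addr;	/* NULL when the slot is free */
};

static struct hp_slot hp_slots[NR_SLOTS];

/*
 * Reader side: publish the pointer loaded from *src in the slot, then
 * re-load *src to confirm it was not changed concurrently.
 * Returns the protected pointer, or NULL if protection failed.
 */
static inline void *hp_sketch_acquire(struct hp_slot *slot,
				      _Atomic(void *) *src)
{
	void *p = atomic_load_explicit(src, memory_order_relaxed);

	if (!p)
		return NULL;
	atomic_store_explicit(&slot->addr, p, memory_order_relaxed);
	/* Full barrier: order the slot store before the re-load of *src. */
	atomic_thread_fence(memory_order_seq_cst);
	if (atomic_load_explicit(src, memory_order_relaxed) != p) {
		/* Lost the race with an updater: clear the slot and fail. */
		atomic_store_explicit(&slot->addr, NULL, memory_order_release);
		return NULL;
	}
	return p;
}

/* Reader side: drop the protection by clearing the slot. */
static inline void hp_sketch_release(struct hp_slot *slot)
{
	atomic_store_explicit(&slot->addr, NULL, memory_order_release);
}

/*
 * Updater side: after unpublishing p, check whether any slot still
 * holds it. Returns 1 if p is still protected, 0 otherwise.
 */
static inline int hp_sketch_is_protected(void *p)
{
	/* Full barrier: pairs with the fence in hp_sketch_acquire(). */
	atomic_thread_fence(memory_order_seq_cst);
	for (size_t i = 0; i < NR_SLOTS; i++) {
		if (atomic_load_explicit(&hp_slots[i].addr,
					 memory_order_acquire) == p)
			return 1;
	}
	return 0;
}
```

An updater would unpublish the pointer first, then wait until
hp_sketch_is_protected() returns 0 before freeing the object. The reader
path is only a few loads, one store and one fence, which is why the cost
of a function call shows up in the numbers.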

About code browsers like clangd, I would recommend improving the tooling
rather than alter the design of the code based on current tooling
limitations.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com




