On 2024-09-28 13:22, Jonas Oberhauser wrote:
Two more questions below:
On 9/21/2024 at 6:42 PM, Mathieu Desnoyers wrote:
+#define NR_PERCPU_SLOTS_BITS 3
Have you measured any advantage of this multi-slot version vs a version
with just one normal slot and one emergency slot?
No, I have not. That being said, I am taking a minimalistic
approach that takes things even further in the "simple" direction
for what I will send as RFC against the Linux kernel:
there is just the one "emergency slot", irqs are disabled around
use of the HP slot, and promotion to refcount is done before
returning to the caller.
With just one normal slot, the normal slot version would always be zero,
and there'd be no need to increment etc., which might make the common
case (no conflict) faster.
The multi-slot version allows preemption while holding a slot. It also
allows HP slot users to hold a slot longer without promoting to a
refcount right away.
I even have a patch that dynamically adapts the scan depth (increase
by readers, decrease by synchronize, with hysteresis) in my userspace
prototype. This splits the number of allocated slots from the scan
depth. But I keep that for later and will focus on the simple case
first (single HP slot, only used with irqoff).
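The adaptive-depth patch itself is not shown in this thread, so the following is only a minimal sketch of the idea under stated assumptions: readers raise the scan depth when they need a deeper slot, and synchronize lowers it by one only after several consecutive scans found the top slot unused. All names (struct scan_depth, scan_depth_grow, scan_depth_shrink, SHRINK_THRESHOLD) are hypothetical, and the memory ordering a real implementation needs between the depth update and the slot store is deliberately left out.

```c
#include <stdatomic.h>

#define NR_PERCPU_SLOTS   8  /* 1 << NR_PERCPU_SLOTS_BITS from the patch */
#define SCAN_DEPTH_MIN    1  /* hypothetical: always scan at least one slot */
#define SHRINK_THRESHOLD  4  /* hypothetical: quiet scans needed before shrinking */

struct scan_depth {
	atomic_uint depth;        /* synchronize scans slots [0, depth) */
	atomic_uint quiet_scans;  /* consecutive scans with the top slot unused */
};

static void scan_depth_init(struct scan_depth *sd)
{
	atomic_init(&sd->depth, SCAN_DEPTH_MIN);
	atomic_init(&sd->quiet_scans, 0);
}

/* Reader side: about to use slot `idx`, so make sure scans cover it. */
static void scan_depth_grow(struct scan_depth *sd, unsigned int idx)
{
	unsigned int want = idx + 1;
	unsigned int cur = atomic_load(&sd->depth);

	if (want > NR_PERCPU_SLOTS)
		want = NR_PERCPU_SLOTS;
	if (cur >= want)
		return;		/* already covered, nothing to do */
	/* Raise depth to `want` unless someone already raised it further. */
	while (cur < want &&
	       !atomic_compare_exchange_weak(&sd->depth, &cur, want))
		;
	atomic_store(&sd->quiet_scans, 0);
}

/* Synchronize side: `used` is 1 + the highest occupied slot index seen, or 0. */
static void scan_depth_shrink(struct scan_depth *sd, unsigned int used)
{
	unsigned int cur = atomic_load(&sd->depth);

	if (used >= cur || cur <= SCAN_DEPTH_MIN) {
		/* Top slot still in use (or nothing to shrink): restart the count. */
		atomic_store(&sd->quiet_scans, 0);
		return;
	}
	if (atomic_fetch_add(&sd->quiet_scans, 1) + 1 < SHRINK_THRESHOLD)
		return;		/* hysteresis: not enough quiet scans yet */
	atomic_store(&sd->quiet_scans, 0);
	/* Shrink by one; a concurrent grow makes this CAS fail, which is fine. */
	atomic_compare_exchange_strong(&sd->depth, &cur, cur - 1);
}
```

The asymmetry (grow immediately, shrink slowly) is what keeps the depth from oscillating when the load hovers around a slot boundary.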
Either way, I recommend stress testing with just one normal slot to
increase the chance of conflicts (and hence of triggering corner cases).
Good point.
+retry:
+	node = uatomic_load(node_p, CMM_RELAXED);
+	if (!node)
+		return false;
+	/* Use rseq to try setting current slot hp. Store B. */
+	if (rseq_load_cbne_store__ptr(RSEQ_MO_RELAXED, RSEQ_PERCPU_CPU_ID,
+				(intptr_t *) &slot->node, (intptr_t) NULL,
+				(intptr_t) node, cpu)) {
+		slot = &cpu_slots->slots[HPREF_EMERGENCY_SLOT];
+		use_refcount = true;
+		/*
+		 * This may busy-wait for another reader using the
+		 * emergency slot to transition to refcount.
+		 */
+		caa_cpu_relax();
+		goto retry;
+	}
I'm not familiar with Linux' preemption model. Can this deadlock if a
low-interrupt-level thread is occupying the EMERGENCY slot and a
higher-interrupt-level thread is also trying to take it?
This is a userspace prototype. This will behave similarly to a userspace
spinlock in that case, which is not great in terms of CPU usage, but
should eventually unblock the waiter, unless it has a RT priority that
really prevents any progress from the emergency slot owner.
On my TODO list, I have a bullet about integrating with sys_futex to
block on wait, wake up on slot release. I would then use the wait/wakeup
code based on sys_futex already present in liburcu.
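For illustration only, here is a rough sketch of what "block on wait, wake up on slot release" could look like on Linux. It is not the liburcu wait/wakeup code and not part of the patch: struct emergency_slot, slot_acquire() and slot_release() are hypothetical names, and the sketch assumes a separate 32-bit state word next to the slot, since futex operates on 32-bit words while the slot stores a pointer. The three-state scheme (free, busy, busy with waiters) is the classic futex mutex pattern, which skips the wake-up syscall when nobody is waiting.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

enum { SLOT_FREE = 0, SLOT_BUSY = 1, SLOT_BUSY_WAITERS = 2 };

struct emergency_slot {
	_Atomic uint32_t state;	/* futex word (hypothetical, not in the patch) */
	void *node;		/* protected pointer, owned by the slot holder */
};

/* Thin wrapper: glibc does not export a futex() function. */
static long futex_op(_Atomic uint32_t *uaddr, int op, uint32_t val)
{
	return syscall(SYS_futex, (uint32_t *) uaddr, op, val, NULL, NULL, 0);
}

/* Take the slot, sleeping in the kernel instead of spinning when it is busy. */
static void slot_acquire(struct emergency_slot *s, void *node)
{
	uint32_t c = SLOT_FREE;

	if (!atomic_compare_exchange_strong(&s->state, &c, SLOT_BUSY)) {
		/* Contended: advertise a waiter, then sleep until released. */
		if (c != SLOT_BUSY_WAITERS)
			c = atomic_exchange(&s->state, SLOT_BUSY_WAITERS);
		while (c != SLOT_FREE) {
			/* Returns immediately if state changed in the meantime. */
			futex_op(&s->state, FUTEX_WAIT_PRIVATE, SLOT_BUSY_WAITERS);
			c = atomic_exchange(&s->state, SLOT_BUSY_WAITERS);
		}
	}
	s->node = node;
}

/* Release the slot (e.g. after promotion to refcount) and wake one waiter. */
static void slot_release(struct emergency_slot *s)
{
	s->node = NULL;
	if (atomic_exchange(&s->state, SLOT_FREE) == SLOT_BUSY_WAITERS)
		futex_op(&s->state, FUTEX_WAKE_PRIVATE, 1);
}
```

Compared with the caa_cpu_relax() loop, a blocked waiter gives up the CPU, which would also let a lower-priority slot owner make progress. A real integration would still need the slot's node pointer to be readable by the synchronize-side scan, which this sketch ignores.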
Thanks!
Mathieu
Best wishes,
jonas
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com