Tue, Feb 07, 2023 at 12:54:16AM +0100, Udo Richter wrote:
Two-ended buffers are pretty good when used correctly, but nowadays
they have a small chance of triggering memory ordering issues, where it
is possible that written data to the buffer is still stuck in a distant
cache, while the updated head pointer is already known to other CPU
cores. To be absolutely safe, the head pointer would need atomic
read/write, forcing ordering with a memory barrier.
A nice explanation. Before C++11 (ISO/IEC 14882:2011), to my
understanding, multi-threaded programs or data races were not part of
the standard. Starting with C++11, data races actually are undefined
behavior, and not just "implementation defined". I would recommend the
following resources:
https://en.cppreference.com/w/cpp/atomic/memory_order
https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html
The latter only covers loads and stores, not read-modify-write like
std::atomic::fetch_add(), but it gives an idea how things look like in
different processors.
But in general the timing for such a bug is pretty narrow, and advanced
CPUs (esp. intel/amd) make sure that things get stored in order anyway.
With the IA-32 and AMD64 a.k.a. x86-64 ISA, the general expectation is
that implementations use Total Store Order (TSO). This is also how the
SPARC and the RISC-V RVTSO works.
However, on ARM (and POWER and RISC-V RVWMO), weak memory ordering is
the norm. On the ARM, some atomic operations would use retry loops, or
with the recent LSE (Large System Extensions), special instructions. It
should be remembered that there are VDR systems running on ARM, like the
Raspberry Pi.
Worst thing would also just be false data read, not a crash.
If that false data were a stale pointer that points to now-freed memory,
you could get a crash. Or you could get a hang or an assertion failure
catching an inconsistent state.
Somewhat related to this, I recently learned about Linux "restartable
sequences" https://lwn.net/Articles/883104/ which could be an
alternative to using mutexes in some lock-free data structures. I think
it could work for things where data would be "published" by updating
just one pointer at the end. I didn't think of an application of that in
VDR yet, and I am not sure if there is any. It is better to keep things
simple if the performance is acceptable.
Marko
_______________________________________________
vdr mailing list
vdr@xxxxxxxxxxx
https://www.linuxtv.org/cgi-bin/mailman/listinfo/vdr