Because of the heap-use-after-free race condition that was rather easily
reproducible with AddressSanitizer (-fsanitize=address), I thought that
I should finally try to learn to use ThreadSanitizer (TSAN,
-fsanitize=thread in GCC and clang).
https://clang.llvm.org/docs/ThreadSanitizer.html
Because VDR makes use of POSIX thread synchronization primitives, no
additional instrumentation via <sanitizer/tsan_interface.h> should be
necessary.
Before C++11 defined a memory model for multi-threaded applications,
semantics around shared data structures were rather unclear, and I would
guess that most multi-threaded pre-C++11 code bases would trip
ThreadSanitizer. Also, multi-threaded CPUs were rare in the early days,
and the Total Store Order of the x86 is very forgiving, compared to the
weak memory model of ARM (see
https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html for some
examples).
To silence one prominent source of TSAN warnings in the cThread
destructor, I applied the attached patch. It is not ideal, because
std::atomic defaults to std::memory_order_seq_cst while
std::memory_order_relaxed would likely suffice here.
Even after applying the first patch, a simple test with no DVB receiver
device and no valid data directory would still produce a couple of data
race reports (see the end of this message). I recorded a trace of such a
run with "rr record" (https://rr-project.org) and debugged it in "rr
replay". Unsurprisingly, I did not find actual evidence of a race
condition.
Finally, I figured out what is causing the first report:
cThread::description is not protected by cThread::mutex. Possibly, most
cThread data members (including cThread::active) should be protected by
cThread::mutex?
With both attached patches applied, the report of the first race will
disappear. The second race is apparently about some memory that is
allocated inside opendir(). I did not figure it out yet.
Related to this, cThread::Cancel() especially when invoked with
WaitSeconds=-1 looks problematic to me, and I see that VDR is invoking
pthread_detach() and never invoking pthread_join(). The second patch
includes an attempt to clean this up as well.
Both patches are just a proof of concept; I did not test them beyond the
simple failing-to-start VDR run under TSAN. Unfortunately, TSAN is not
available for my primary VDR hardware, running on 32-bit ARM.
With best regards,
Marko
vdr: error while reading '/var/lib/vdr/sources.conf'
vdr: error while reading '/var/lib/vdr/channels.conf'
vdr: no primary device found - using first device!
==================
WARNING: ThreadSanitizer: data race on vptr (ctor/dtor vs virtual call) (pid=96847)
Write of size 8 at 0x7ffc7f773e60 by main thread:
#0 cThread::~cThread() /dev/shm/vdr/thread.c:249 (vdr+0x1d72b8)
#1 cEpgDataReader::~cEpgDataReader() /dev/shm/vdr/epg.h:236 (vdr+0xa956b)
#2 main /dev/shm/vdr/vdr.c:731 (vdr+0xa956b)
Previous read of size 8 at 0x7ffc7f773e60 by thread T2:
#0 cThread::StartThread(cThread*) /dev/shm/vdr/thread.c:293 (vdr+0x1d76d9)
Location is stack of main thread.
Location is global '<null>' at 0x000000000000 ([stack]+0x1fe60)
Thread T2 'epg data reader' (tid=96855, finished) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x5e686)
#1 cThread::Start() /dev/shm/vdr/thread.c:316 (vdr+0x1d6fe0)
#2 main /dev/shm/vdr/vdr.c:804 (vdr+0xaa477)
SUMMARY: ThreadSanitizer: data race on vptr (ctor/dtor vs virtual call) /dev/shm/vdr/thread.c:249 in cThread::~cThread()
==================
==================
WARNING: ThreadSanitizer: data race (pid=96847)
Write of size 8 at 0x7b04000005a0 by main thread:
#0 free ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:706 (libtsan.so.2+0x47e82)
#1 cString::~cString() /dev/shm/vdr/tools.c:1097 (vdr+0x1e52df)
#2 cxa_at_exit_wrapper ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:389 (libtsan.so.2+0x2dee3)
Previous read of size 8 at 0x7b04000005a0 by thread T1:
#0 opendir ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:3271 (libtsan.so.2+0x4c641)
#1 cReadDir::cReadDir(char const*) /dev/shm/vdr/tools.c:1553 (vdr+0x1ea8bd)
#2 cVideoDirectoryScannerThread::ScanVideoDir(char const*, int, int) /dev/shm/vdr/recording.c:1439 (vdr+0x180620)
#3 cVideoDirectoryScannerThread::Action() /dev/shm/vdr/recording.c:1433 (vdr+0x180bfc)
#4 cThread::StartThread(cThread*) /dev/shm/vdr/thread.c:293 (vdr+0x1d76ea)
Thread T1 'video directory' (tid=96854, running) created by main thread at:
#0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x5e686)
#1 cThread::Start() /dev/shm/vdr/thread.c:316 (vdr+0x1d6fe0)
#2 cRecordings::Update(bool) /dev/shm/vdr/recording.c:1554 (vdr+0x175387)
#3 main /dev/shm/vdr/vdr.c:788 (vdr+0xaa3f8)
SUMMARY: ThreadSanitizer: data race /dev/shm/vdr/tools.c:1097 in cString::~cString()
==================
ThreadSanitizer: reported 2 warnings
diff --git a/thread.h b/thread.h
index 16c4bd75..dd21346c 100644
--- a/thread.h
+++ b/thread.h
@@ -13,6 +13,9 @@
#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>
+#if __cplusplus >= 201103L
+# include <atomic>
+#endif
typedef pid_t tThreadId;
@@ -79,8 +82,13 @@ public:
class cThread {
friend class cThreadLock;
private:
+#if __cplusplus >= 201103L
+ std::atomic<bool> active;
+ std::atomic<bool> running;
+#else
bool active;
bool running;
+#endif
pthread_t childTid;
tThreadId childThreadId;
cMutex mutex;
diff --git a/thread.c b/thread.c
index 93eb8c0d..5f76ca76 100644
--- a/thread.c
+++ b/thread.c
@@ -249,7 +249,11 @@ cThread::cThread(const char *Description, bool LowPriority)
cThread::~cThread()
{
Cancel(); // just in case the derived class didn't call it
+ if (childTid)
+ pthread_join(childTid, NULL);
+ mutex.Lock();
free(description);
+ mutex.Unlock();
}
void cThread::SetPriority(int Priority)
@@ -266,6 +270,7 @@ void cThread::SetIOPriority(int Priority)
void cThread::SetDescription(const char *Description, ...)
{
+ mutex.Lock();
free(description);
description = NULL;
if (Description) {
@@ -274,6 +279,7 @@ void cThread::SetDescription(const char *Description, ...)
description = strdup(cString::vsprintf(Description, ap));
va_end(ap);
}
+ mutex.Unlock();
}
void *cThread::StartThread(cThread *Thread)
@@ -291,8 +297,10 @@ void *cThread::StartThread(cThread *Thread)
Thread->SetIOPriority(7);
}
Thread->Action();
+ Thread->mutex.Lock();
if (Thread->description)
dsyslog("%s thread ended (pid=%d, tid=%d)", Thread->description, getpid(), Thread->childThreadId);
+ Thread->mutex.Unlock();
Thread->running = false;
Thread->active = false;
return NULL;
@@ -314,7 +322,6 @@ bool cThread::Start(void)
if (!active) {
active = running = true;
if (pthread_create(&childTid, NULL, (void *(*) (void *))&StartThread, (void *)this) == 0) {
- pthread_detach(childTid); // auto-reap
}
else {
LOG_ERROR;
@@ -364,6 +371,7 @@ void cThread::Cancel(int WaitSeconds)
esyslog("ERROR: %s thread %d won't end (waited %d seconds) - canceling it...", description ? description : "", childThreadId, WaitSeconds);
}
pthread_cancel(childTid);
+ pthread_join(childTid, NULL);
childTid = 0;
active = false;
}
_______________________________________________
vdr mailing list
vdr@xxxxxxxxxxx
https://www.linuxtv.org/cgi-bin/mailman/listinfo/vdr