On Tue, Feb 21, 2023 at 10:47:28AM +0100, Klaus Schmidinger wrote:
> On 19.02.23 18:29, Patrick Lerda wrote:
>> ...
>> I had definitely a few crashes related to this class. Thread safety
>> issues are often not easily reproducible. Is your environment 100%
>> reliable?
>
> My VDR runs for weeks, even months 24/7 without problems.
> I only restart it when I have a new version.
How many threads would be created or destroyed per day in your typical
usage? If we assume a couple of thousand such events per day, that is
roughly a million events per year. If the crash probability is something
like one in a billion or one in a trillion per event, it could take a
thousand or a million years before the crash is reproduced in this way.
And even if it did occur, would you be in a position to debug it
thoroughly, with the next scheduled recording approaching in a few
minutes?
I was thinking that it could be helpful to implement some automated
testing of restarts. I made a simple experiment with a tuner stick
plugged into the USB port of an AMD64 laptop (ARM would be much better
for reproducing many race conditions, thanks to its weaker memory
ordering) and no aerial cable:
mkdir /dev/shm/v
touch /dev/shm/v/sources.conf /dev/shm/v/channels.conf
i=0
while ./vdr --no-kbd -L. -Pskincurses -c /dev/shm/v -v /dev/shm/v
do
    echo -n "$i"
    i=$((i+1))
done
First, I thought of using an unpatched VDR. The easiest way to trigger a
shutdown would seem to be SIGHUP, but I did not figure out how to
automate the sending of that signal (a rough sketch of how that could be
scripted follows below, after the patch). Instead, I thought I would
apply a crude patch to the code, like this:
diff --git a/vdr.c b/vdr.c
index 1bdc51ab..b35c4aeb 100644
--- a/vdr.c
+++ b/vdr.c
@@ -1024,6 +1024,7 @@ int main(int argc, char *argv[])
dsyslog("SD_WATCHDOG ping");
}
#endif
+ EXIT(0);
// Handle channel and timer modifications:
{
// Channels and timers need to be stored in a consistent manner,
I did not check if this would actually exercise the thread creation and
shutdown. Maybe not sufficiently, since I do not see any skincurses
output on the screen.
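For completeness, the SIGHUP variant could probably be automated along
these lines. This is only an untested sketch; it assumes that SIGHUP
makes VDR exit with status 1 (the "restart" status that runvdr checks
for) and that a 5-second warm-up is enough to get the threads started:

i=0
while :
do
    ./vdr --no-kbd -L. -Pskincurses -c /dev/shm/v -v /dev/shm/v &
    pid=$!
    sleep 5                  # give the threads some time to start up
    kill -HUP "$pid"         # ask VDR to restart itself
    wait "$pid"
    [ $? -eq 1 ] || break    # anything but the "restart" status ends the test
    echo -n "$i"
    i=$((i+1))
done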
Several such test loops against a vanilla VDR code base could be run
concurrently, using different DVB tuners, configuration directories, and
SVDRP ports. The test harness could issue HITK commands to randomly
switch channels, start and stop recordings, and finally restart VDR. As
long as the process keeps returning the expected exit status on restart,
the harness would restart it.
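To make that concrete, here is a rough, untested sketch of one harness
step. The port number and key names are only examples; it assumes the
instance was started with --port=6419 and that the svdrpsend script from
the VDR source tree is used:

PORT=6419
for n in $(seq 1 50)
do
    ./svdrpsend -p $PORT HITK "Channel+"   # hop to the next channel
    ./svdrpsend -p $PORT HITK Record       # start an instant recording
    sleep 2
    ./svdrpsend -p $PORT HITK Stop         # and stop it again
done
./svdrpsend -p $PORT HITK Power            # finally request a shutdown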
It should be possible to cover tens or hundreds of thousands of VDR
restarts per day, and many more if the startup and shutdown logic were
streamlined to shorten any timeouts. In my environment, each iteration
with the above patch took about 3 seconds, which I find somewhat
excessive.
Should a problem be caught in this way, we should be able to get a core
dump of the crash, or attach GDB to a hung process to examine what is
going on.
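For reference, with the classic core file setup (i.e. no systemd-coredump
intercepting the dumps), the preparation would be roughly:

ulimit -c unlimited        # in the shell that runs the test loop
# and if an iteration hangs instead of crashing, attach to the live
# process (assuming only one vdr instance is running):
gdb -p "$(pidof vdr)"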
Patrick, did you try reproducing any VDR problems under "rr record"
(https://rr-project.org/)? Debugging in "rr replay" would give access to
the exact sequence of events. For those race conditions that can be
reproduced in that way, debugging becomes almost trivial. (Just set some
data watchpoints and reverse-continue from the final state.)
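In case it helps, the workflow I have in mind is roughly this, assuming
rr is installed and the CPU is supported:

rr record ./vdr --no-kbd -L. -Pskincurses -c /dev/shm/v -v /dev/shm/v
rr replay      # replays exactly the same execution under GDB
# inside GDB: run to the crash, set a watchpoint on the corrupted data,
# and reverse-continue to land on the write that caused it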
Marko