On 19.11.24 at 08:21, Marko Mäkelä wrote:
Fri, Nov 15, 2024 at 07:43:35PM +0100, schorpp wrote:
bt full / thread apply all bt
#2 0x081330f4 in cRemote::Get (WaitMs=10, UnknownCode=0x0) at remote.c:194
MutexLock = {mutex = 0x822feb0, locked = true}
#3 0x080f225b in cInterface::GetKey (this=0x9ee02c8, Wait=<optimized out>) at interface.c:41
No locals.
#4 0x080aeda3 in main (argc=0, argv=<optimized out>) at vdr.c:1066
Is this really the only thread?
No, some break in to show that debug symbols are available now.
"thread apply all backtrace" should show the stack traces of all threads. There is also "info threads". This seems to be the main thread, waiting for input from the remote control.
I know, thanks.
Strangely, the bug no longer occurs when using the vdr-dbg executable?
Maybe the vdr-dbg has been built with different options, such as without some optimizations. It could affect the timing enough, if this is related to some race condition.
Maybe.
If gdb is attached to the vdr process, the bug does not occur. If it is not attached, the SIGFPE occurs intermittently on the first recording start; it does not occur on subsequent recording starts.
I'll set ulimit -c unlimited in the runvdr script to get a core dump for gdb, if one is produced for SIGFPE.
I've upgraded to the latest longterm stable kernel, 4.19.324, which catches the bug:
$ grep trap /var/log/syslog
Nov 22 16:25:17 vdr2 kernel: [ 272.072890] traps: recording[5468] trap divide error ip:8136f6a sp:9f9da200 error:0 in vdr[8048000+186000]
Now I've attached gdb and am waiting to see whether it occurs again, to get a full bt.
HA! I've finally got this bitch of an intermittent bug:
Program received signal SIGFPE, Arithmetic exception.
[Switching to Thread 0xad0ffb40 (LWP 27522)]
0x08136f6a in cFrameDetector::Analyze (this=0x9ade480, Data=<optimized out>, Length=296852) at remux.c:1567
1567 uint32_t Delta = ptsValues[0] / (framesPerPayloadUnit + parser->IFrameTemporalReferenceOffset());
(gdb) bt
#0 0x08136f6a in cFrameDetector::Analyze (this=0x9ade480, Data=<optimized out>, Length=296852) at remux.c:1567
#1 0x081295b2 in cRecorder::Action (this=0x9c41290) at recorder.c:127
#2 0x08165e13 in cThread::StartThread (Thread=0x9c413b4) at thread.c:262
#3 0xb7f1fd97 in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#4 0xb7c39dfe in clone () from /lib/i386-linux-gnu/libc.so.6
(gdb)
(gdb) bt full
#0 0x08136f6a in cFrameDetector::Analyze (this=0x9ade480, Data=<optimized out>, Length=296852) at remux.c:1567
Delta = <optimized out>
Pid = <optimized out>
Handled = <optimized out>
Processed = <optimized out>
#1 0x081295b2 in cRecorder::Action (this=0x9c41290) at recorder.c:127
Count = <optimized out>
r = 296852
b = 0x8fd38368 "G^\025\064\ap\177G\025\341~\274"
t = {begin = 102034543}
InfoWritten = <optimized out>
FirstIframeSeen = <optimized out>
#2 0x08165e13 in cThread::StartThread (Thread=0x9c413b4) at thread.c:262
No locals.
#3 0xb7f1fd97 in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
No symbol table info available.
#4 0xb7c39dfe in clone () from /lib/i386-linux-gnu/libc.so.6
No symbol table info available.
But gdb has no data available about
(framesPerPayloadUnit + parser->IFrameTemporalReferenceOffset())
I assume it is a divide-by-zero exception?
But in which case can this be zero?
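To illustrate what I mean: on x86 an integer division by zero raises the "divide error" trap, which is delivered as SIGFPE, so the divisor sum being 0 would match the kernel log. The following is only a minimal sketch of a guard, not the actual VDR code. SafeDelta and its parameter names and types are hypothetical, and I am only assuming that both terms are plain ints whose sum could end up as 0 (both zero, or cancelling each other).

```cpp
#include <cstdint>

// Hypothetical helper, NOT part of VDR: computes Pts / (A + B) like the
// expression at remux.c:1567, but refuses to divide when the sum is not
// positive. The parameter types are assumptions for illustration only.
bool SafeDelta(uint32_t Pts, int FramesPerPayloadUnit, int TemporalReferenceOffset, uint32_t &Delta)
{
  int Divisor = FramesPerPayloadUnit + TemporalReferenceOffset;
  if (Divisor <= 0)
     return false; // Divisor == 0 would trap with SIGFPE; Delta stays untouched
  Delta = Pts / uint32_t(Divisor);
  return true;
}
```

With such a guard the caller could skip the frame rate calculation for that payload unit and retry on the next one, instead of crashing. Treating a negative sum as invalid too is my own choice here; whether that fits the detector logic is a separate question.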
y
tom