Hi Sebastian,

> > > > > > The kernel version 4.1.37-rt43 is working fine, the kernel
> > > > > > versions 4.4.39-rt50 and 4.8.11-rt7 show the same strange
> > > > > > behavior as 4.9.0-rt1.
> > > > > > Something on the way between 4.1.37-rt43 and 4.4.39-rt50
> > > > > > seems to cause the trouble.
> > > > >
> > > > > can you check if one of the earlier v4.4-RT releases (maybe
> > > > > start with -rt2 or -rt1) also shows this behaviour?
> > > > > If not, would you have a testcase?
> > > > I tested with 4.4.0-rt2.
> > > > Here this issue occurs very rarely. I ran the gdb for about 30
> > > > times, and here I got one hit.
> > > > With the other kernel versions I got the issue in 20-40% of the
> > > > cases.
> > > >
> > > > I will try out the other 4.4.0 releases...
> > >
> > > I ran more tests. The kernel 4.4.1-rt4 is working fairly fine (only
> > > about one hit every 30 runs) but the kernel 4.4.1-rt5 is causing
> > > the issue very often.
> >
> > and 4.1.37-rt43 shows not SIGSTOPS in 100 runs?
> That's right.
> I used 4.1.37-rt43 and ran the test for 200 times. There was no
> single issue.

I now have a simple executable with which I can reproduce a very similar
issue. It might be related to the one I see, as it also affects gdb
debugging.

The simple executable (see below or in the attachment) runs perfectly
without gdb. However, with kernel version 4.4.1-rt6 and also with
4.9.4-rt2 it very often does not run properly under gdb. With kernel
version 4.1.37-rt43 it always runs properly under gdb.

Whenever the issue occurs, gdb does not reach the end of the executable
and I have to stop gdb with CTRL-C.
Here is the gdb output on kernel 4.9.4-rt7.

(gdb) info thr
shows
  Id   Target Id         Frame
* 1    Thread 0x7ffff7fdf700 (LWP 5383) "gdb-issue" 0x00007ffff79bd6bd in pthread_join (threadid=140737343743744, thread_return=0x0) at pthread_join.c:90
  2    Thread 0x7ffff7616700 (LWP 5387) "gdb-issue" 0x00007ffff76cee4b in __GI___waitpid (pid=5540, stat_loc=stat_loc@entry=0x7ffff7615d50, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29

(gdb) thr 2
(gdb) bt
shows
#0  0x00007ffff76cee4b in __GI___waitpid (pid=5540, stat_loc=stat_loc@entry=0x7ffff7615d50, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
#1  0x00007ffff765608b in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
#2  0x0000555555554df8 in thread_func (arg=0x5555557560c0 <thread_data>) at gdb-issue.c:54
#3  0x00007ffff79bc464 in start_thread (arg=0x7ffff7616700) at pthread_create.c:333
#4  0x00007ffff76ff9df in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105

Sometimes it works fine; however, in about 20-30% of the runs the gdb
session does not work properly.

Does this give you any hint?

Thanks for the support.
Regards
Mathias

--------------- BEGIN of C CODE -------------
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <stdbool.h>
#include <sys/eventfd.h>
#include <sys/mman.h>

static volatile bool terminated;

typedef struct {
    pthread_t thread;
    int id;
    int trigger_fd;
} thread_data_t;

void *thread_func(void *arg)
{
    int rc;
    thread_data_t *thr = arg;
    int loop;

    {
        struct sched_param param;

        memset(&param, 0, sizeof(param));
        param.sched_priority = 60;
        rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
        if (rc) {
            /* pthread functions return the error code, errno is not set */
            fprintf(stderr, "pthread_setschedparam %s\n", strerror(rc));
            exit(1);
        }

        /* Run each thread on a separate CPU core */
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(thr->id, &set);
        rc = sched_setaffinity(0, sizeof(set), &set);
        if (rc) {
            fprintf(stderr, "sched_setaffinity %m\n");
            exit(1);
        }
    }

    for (loop = 0; !terminated; loop++) {
        uint64_t u64;

        /* Block until main() triggers this thread via its eventfd */
        rc = read(thr->trigger_fd, &u64, 8);
        if (rc > 0) {
            /* printf("thread %i loop %i\n", thr->id, loop); */
            system("lspci -xxx > /dev/null");
        } else {
            fprintf(stderr, "read %m\n");
        }
    }

    printf("End of thread %i\n", thr->id);
    return NULL;
}

#define NO_THREADS 4
static thread_data_t thread_data[NO_THREADS];

int main(int argc, char *argv[])
{
    int i;
    int rc;

    /* Init */
    mlockall(MCL_CURRENT | MCL_FUTURE);

    for (i = 0; i < NO_THREADS; i++) {
        thread_data[i].id = i;
        rc = thread_data[i].trigger_fd = eventfd(0, 0);
        if (rc < 0) {
            fprintf(stderr, "Error creating eventfd: %m\n");
            return 1;
        }
    }

    for (i = 0; i < NO_THREADS; i++) {
        rc = pthread_create(&thread_data[i].thread, NULL, thread_func,
                            &thread_data[i]);
        if (rc) {
            fprintf(stderr, "Error creating thread: %s\n", strerror(rc));
            return 1;
        }
    }

    /* Loop */
    int j;
    int max = 100;

    for (j = 0; j <= max; j++) {
        struct timespec ts = { 0, 10000000 };   /* 10 ms */

        clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);
        if (j == max) {
            terminated = true;
        }
        for (i = 0; i < NO_THREADS; i++) {
            uint64_t u64 = 1;

            write(thread_data[i].trigger_fd, &u64, 8);
        }
    }

    /* Cleanup */
    for (i = 0; i < NO_THREADS; i++) {
        pthread_join(thread_data[i].thread, NULL);
        printf("Thread %i joined\n", i);
        close(thread_data[i].trigger_fd);
    }

    return 0;
}
--------------- END of C CODE -------------
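
The hit rates quoted above (for example one hit in 30 runs, or 20-40% of
the runs) could also be counted automatically instead of pressing CTRL-C
by hand. The following is only a sketch of such a harness and not the
procedure used for the numbers in this mail. The helper name
run_with_timeout, the 100 ms polling interval and the use of SIGKILL are
assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the original test case):
 * run argv[] as a child process and wait at most timeout_sec seconds.
 * Returns  0 if the child terminated by itself,
 *          1 if it was still running after the timeout and was killed
 *            (this would be counted as a "hit"),
 *         -1 if fork() or waitpid() failed.
 */
static int run_with_timeout(char *const argv[], int timeout_sec)
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);             /* exec failed */
    }

    /* Poll every 100 ms until the child is gone or the time is up */
    const struct timespec step = { 0, 100 * 1000 * 1000 };
    for (long waited_ms = 0; waited_ms < timeout_sec * 1000L; waited_ms += 100) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);

        if (r == pid)
            return 0;
        if (r < 0)
            return -1;
        nanosleep(&step, NULL);
    }

    /* Assumption: a run that exceeds the timeout is treated as hung.
     * Processes left behind by the killed child are not cleaned up here. */
    kill(pid, SIGKILL);
    waitpid(pid, NULL, 0);
    return 1;
}
```

Called in a loop with an argument vector such as
{ "gdb", "-batch", "-ex", "run", "./gdb-issue", NULL } and a generous
timeout (for example 60 seconds, also an assumption), the number of
calls returning 1 divided by the number of runs gives the hit rate. The
test program still needs sufficient privileges for SCHED_FIFO.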
Attachment: gdb-issue.tgz