> On Sat, 19 Oct 2019 07:50:53 +0000 > Allende Imanol <iallende@xxxxxxxxxx> wrote: > > > I am using Ftrace in a Ultra96 (arm64) 4.19.75-cip11 and it seems that > > I have a deadlock situation. I have a simple application executed as a > > child process from a python script. The child process reads from > > /dev/urandom and writes in /dev/zero. Ftrace is set only for this child > > process. > > Are you saying that ftrace is calling the blocked task? > > If you don't run ftrace, the task never gets stuck? Without FTrace I did not get to achieve the same stuck scenario. > > > > > > After thousands of executions of the child process with a high IRQ and > > CPU load (it varies in the number but it is quite reproducible) - the child > > process gets blocked. But the system still is running. > > > > The child process appears as "sleeping state" and if I try to do kill > > -18 $pid it does not change. > > > > In this case the child process is 4197 PID. > > > > # ps -aux | grep python > > root 1145 6.3 0.4 85448 9508 ? S<l 00:00 23:02 python3 test_client.py > > root 4197 0.0 0.4 85448 8256 ? S< 05:53 0:00 python3 test_client.py > > root 5279 0.0 0.0 2752 280 ttyPS0 S+ 06:02 0:00 grep python > > # cat /proc/4197/stack > > [<0>] __switch_to+0x94/0xd0 > > [<0>] futex_wait_queue_me+0xd0/0x150 > > [<0>] futex_wait+0x12c/0x200 > > [<0>] do_futex+0x304/0xc10 > > Looks like the task is blocked on a user space mutex (futex). > > > [<0>] __arm64_sys_futex+0x154/0x190 > > [<0>] el0_svc_common+0xa0/0x170 > > [<0>] el0_svc_handler+0x38/0x80 > > [<0>] el0_svc+0x8/0xc > > [<0>] 0xffffffffffffffff > > I could reproduce it in my Intel laptop (Debian 10 - 4.19 kernel): # cat /proc/21719/stack [<0>] futex_wait_queue_me+0xc1/0x120 [<0>] futex_wait+0x13f/0x240 [<0>] do_futex+0x3f7/0xbf0 [<0>] __x64_sys_futex+0x143/0x180 [<0>] do_syscall_64+0x53/0x110 [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<0>] 0xffffffffffffffff > > > > The stack is more or less the same every time the block happens. > > > > If I kill -9 the child process the execution loop will continue with > > spawning the next child so it seems that the deadlock is really only > > in the child process - but do not know why. > > > > I am using function_graph traces and I added the funcgraph-proc, > > noirq-info and nofuncgraphs-irq options. I disable tracing after each > > iteration, and I clear the trace buffer and pid set also. > > > > I add the application that is executed as the child process: > > I don't understand what you meant by this. Is the below the code that > gets stuck? But the above shows a python task? The python script executes the read/write application as a child process > > > > > #define DEV1 "/dev/urandom" > > #define DEV2 "/dev/null" > > > > int main(int argc, char **argv) > > { > > unsigned char result; > > int fd1, fd2, ret; > > char res_str[10] = {0}; > > > > fd1 = open(DEV1, O_RDONLY); > > fd2 = open(DEV2, O_WRONLY); > > > > ret = read(fd1, &result, 1); > > sprintf(res_str, "%d", result); > > ret = write(fd2, res_str, strlen(res_str)); > > > > close(fd1); > > close(fd2); > > > > return ret; > > } > > Can you give the full reproducer, and explain the problem in a bit more > detail. I don't really see how ftrace is involved here, besides that > you are running it. This is a part of a bigger trace analysis tool that we are still developing. https://sil2.osadl.org/download/DB4SIL2_client.tar.gz It is ready to execute in Intel based computer. To execute the tool: # python3 test_client.py The tool starts executing hackbench, to disable it, just comment first 2 lines of test_client_code() There is a README with more detailed in explanation > > -- Steve > > > > > The preemption configuration is set to Preemptible kernel and the Ftrace > > config is set: > > CONFIG_FTRACE=y > > CONFIG_FUNCTION_TRACER=y > > CONFIG_FUNCTION_GRAPH_TRACER=y > > # CONFIG_PREEMPTIRQ_EVENTS is not set > > # CONFIG_IRQSOFF_TRACER is not set > > # CONFIG_PREEMPT_TRACER is not set > > # CONFIG_SCHED_TRACER is not set > > # CONFIG_HWLAT_TRACER is not set > > CONFIG_FTRACE_SYSCALLS=y > > # CONFIG_TRACER_SNAPSHOT is not set > > CONFIG_BRANCH_PROFILE_NONE=y > > # CONFIG_PROFILE_ANNOTATED_BRANCHES is not set > > # CONFIG_PROFILE_ALL_BRANCHES is not set > > CONFIG_STACK_TRACER=y > > # CONFIG_BLK_DEV_IO_TRACE is not set > > CONFIG_UPROBE_EVENTS=y > > CONFIG_PROBE_EVENTS=y > > CONFIG_DYNAMIC_FTRACE=y > > # CONFIG_FUNCTION_PROFILER is not set > > CONFIG_FTRACE_MCOUNT_RECORD=y > > # CONFIG_FTRACE_STARTUP_TEST is not set > > # CONFIG_HIST_TRIGGERS is not set > > # CONFIG_TRACEPOINT_BENCHMARK is not set > > # CONFIG_RING_BUFFER_BENCHMARK is not set > > # CONFIG_RING_BUFFER_STARTUP_TEST is not set > > # CONFIG_PREEMPTIRQ_DELAY_TEST is not set > > # CONFIG_TRACE_EVAL_MAP_FILE is not set > > CONFIG_TRACING_EVENTS_GPIO=y