Re: [PATCH v2 2/2] trace-cmd: Fix a possible race condition and deadlock in trace-cmd

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Jun 5, 2019 at 2:21 PM Tzvetomir Stoyanov <tstoyanov@xxxxxxxxxx> wrote:
>
> When pipes are used for communication between trace-cmd main
> thread and per-cpu recorder threads, there is a possible race
> condition in stop_threads(), which can cause a deadlock between
> the main thread and cpu recorder thread:
>    In trace_stream_read(), the select() call can return 0 if threads
>    have no data to send. This will force stop_threads() to stop reading
>    the thread's pipes and enter a waitpid() loop, to wait for all threads
>    to be terminated. However, there is a case when some threads are still
>    flushing its data - tracecmd_flush_recording() tries a blocking write()
>    to the pipe. A dead lock appears - the cpu thread is blocked in write(),
>    as its buffer is full and no one is reading it. The main thread is blocked
>    in waitpid(), to wait the same thread to exit.
> The deadlock can be (randomly) observed with the command
> "trace-cmd profile -p function -F sleep 10"
>
> The proposed fix removes the select timeout, makes the call blocking,
> to ensure the threads are flushed its data before going in waitpid() loop.
>
> Signed-off-by: Tzvetomir Stoyanov <tstoyanov@xxxxxxxxxx>
> ---
>  tracecmd/trace-record.c | 3 +--
>  tracecmd/trace-stream.c | 6 ++++++
>  2 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/tracecmd/trace-record.c b/tracecmd/trace-record.c
> index 4523128..2f5fbd9 100644
> --- a/tracecmd/trace-record.c
> +++ b/tracecmd/trace-record.c
> @@ -626,7 +626,6 @@ static void delete_thread_data(void)
>
>  static void stop_threads(enum trace_type type)
>  {
> -       struct timeval tv = { 0, 0 };
>         int ret;
>         int i;
>
> @@ -643,7 +642,7 @@ static void stop_threads(enum trace_type type)
>         /* Flush out the pipes */
>         if (type & TRACE_TYPE_STREAM) {
>                 do {
> -                       ret = trace_stream_read(pids, recorder_threads, &tv);
> +                       ret = trace_stream_read(pids, recorder_threads, NULL);
>                 } while (ret > 0);
>         }
>
> diff --git a/tracecmd/trace-stream.c b/tracecmd/trace-stream.c
> index dad3466..3814a35 100644
> --- a/tracecmd/trace-stream.c
> +++ b/tracecmd/trace-stream.c
> @@ -92,6 +92,7 @@ int trace_stream_read(struct pid_record_data *pids, int nr_pids, struct timeval
>         struct pid_record_data *last_pid;
>         fd_set rfds;
>         int top_rfd = 0;
> +       int nr_fd;
>         int ret;
>         int i;
>
> @@ -119,18 +120,23 @@ int trace_stream_read(struct pid_record_data *pids, int nr_pids, struct timeval
>                 return 1;
>         }
>
> +       nr_fd = 0;
>         FD_ZERO(&rfds);
>
>         for (i = 0; i < nr_pids; i++) {
>                 /* Do not process closed pipes */
>                 if (pids[i].closed)
>                         continue;
> +               nr_fd++;
>                 if (pids[i].brass[0] > top_rfd)
>                         top_rfd = pids[i].brass[0];
>
>                 FD_SET(pids[i].brass[0], &rfds);
>         }
>
> +       if (!nr_fd)
> +               return 0;
> +
>         ret = select(top_rfd + 1, &rfds, NULL, NULL, tv);
>
>         if (ret > 0)
> --
> 2.21.0
>

Looks good to me.

Reviewed-by: Slavomir Kaslev <kaslevs@xxxxxxxxxx>

Cheers,

-- Slavi



[Index of Archives]     [Linux USB Development]     [Linux USB Development]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux