Re: Non RT threads impact on RT thread

Hey Jordan-

(Quick administrative note: could you please not top post in your replies?)

On Fri, May 25, 2018 at 05:45:44PM +0200, Jordan Palacios wrote:
> On 25 May 2018 at 17:02, Julia Cartwright <julia@xxxxxx> wrote:
> > On Fri, May 25, 2018 at 03:38:47PM +0200, Jordan Palacios wrote:
> >> Hello,
> >>
> >> We managed to trace one of the failing cycles. The trace is here:
> >>
> >> https://pastebin.com/YJBrSQpJ
> >>
[..]
> > In other words: the traces show that this is a userspace problem, not a
> > kernel problem.  Solving this will require you to inspect your
> > application's locking.
> >
> > It may be helpful for you, in this effort, to identify the other thread
> > which eventually issues the FUTEX_WAKE (used for non-PI unlock
> > operation); the trace you linked only includes traces for CPU3, the
> > waker is on another CPU.  The remote wakeup occurs at timestamp
> > 12321.992480.
>
> Quick question. How do you know the wakeup occurs at timestamp 12321.992480?

The CPU has gone completely idle in the subsequent traces.

These traces are the first in the exit-from-idle path.  Given that the
CPU then schedules in your task, it's reasonable to assume that this CPU
exited idle due to a remote wakeup from another CPU.

> At that timestamp the only thing we see is:
>
>   <idle>-0     [003] .n..1.. 12321.992480: rcu_idle_exit <-cpu_startup_entry
>   <idle>-0     [003] dn..1.. 12321.992480: rcu_eqs_exit_common.isra.46 <-rcu_idle_exit
>   <idle>-0     [003] .n..1.. 12321.992480: arch_cpu_idle_exit <-cpu_startup_entry
>   <idle>-0     [003] .n..1.. 12321.992480: atomic_notifier_call_chain <-arch_cpu_idle_exit
>
> And we also see this:
>
>   tuators_manager-1512  [003] ....1.. 12321.992495: sys_futex(uaddr: 7f41cc000020, op: 81, val: 1, utime: 7f41e4a60f19, uaddr2: 0, val3: 302e332e312e31)
>
> Can you explain to us how it is related to the same futex, please? We
> see this call repeatedly across the whole trace.

This is the same futex, as identified by the uaddr argument, but the
operation is 81, which (according to include/uapi/linux/futex.h) is
FUTEX_WAKE | FUTEX_PRIVATE_FLAG.  This is likely an unlock operation.

This makes sense, when you think about it.  Your tuators_manager was
just woken up and completed its pending FUTEX_WAIT op (it successfully
acquired the lock), then it executed its critical section, and now it's
releasing the lock.  That is why you then see this FUTEX_WAKE.

> We'll try tracing the other threads to pick who issues the FUTEX_WAKE.

Identifying them is just the most obvious and easiest starting point.
You'll need to figure out whether or not it makes sense for your
application to be sharing locks between high priority and low-priority
threads.  If it is necessary, then you will at the very least need to
make use of PI mutexes.

Good luck,
   Julia
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


