Re: Why return probes of some syscalls sometimes are not called?

----- On Mar 9, 2017, at 9:44 AM, rostedt rostedt@xxxxxxxxxxx wrote:

> On Thu, 9 Mar 2017 13:58:29 +0000 (UTC)
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> 
>> ----- On Mar 9, 2017, at 8:44 AM, Dorau, Lukasz lukasz.dorau@xxxxxxxxx wrote:
>> 
>> > Hi,
>> > 
>> > Could someone explain to me why the return probes of some syscalls (for
>> > example: futex, poll, epoll_wait) are sometimes not called?
>> > 
>> > It can be reproduced using the following bash script:
>> > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
>> > 
>> > Here are the results gathered from a 60-second test run on kernel 4.9.12 (Fedora 24):
>> > 
>> > futex:       p 56904    r 5489     (90% did not return (51415))
>> > poll:        p 43466    r 7703     (82% did not return (35763))
>> > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
>> 
>> Most likely scenario: those processes are still blocked on those
>> system calls when your tracing ends.
> 
> This is very common, but those numbers are very high. I doubt there were
> 51 thousand threads blocked on a futex when tracing ended.
> 
>> 
>> AFAIU, another possible (less frequent) scenario: a process gets
>> killed with SIGKILL while blocked in the system call.
>> 
> 
> This could be.
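
As a sanity check on the counts quoted above, here is a minimal Python sketch that recomputes the "did not return" figures from the p (entry probe) and r (return probe) hit counts. The helper name `missing` is made up for illustration, and the use of truncating integer division is an assumption: it reproduces the quoted percentages (rounding would give 68% for epoll_wait), but I have not checked how the script in the gist actually computes them.

```python
# (entry probe hits, return probe hits) as quoted in the report above
counts = {
    "futex": (56904, 5489),
    "poll": (43466, 7703),
    "epoll_wait": (73366, 23551),
}


def missing(p, r):
    """Return (count, percent) of entries that had no matching return.

    Hypothetical helper; the percent is truncated, which is an assumption
    chosen because it matches the quoted figures.
    """
    lost = p - r
    pct = lost * 100 // p if p else 0  # avoid dividing by zero when p == 0
    return lost, pct


if __name__ == "__main__":
    for name, (p, r) in counts.items():
        lost, pct = missing(p, r)
        print(f"{name + ':':<12} p {p:<8} r {r:<8} "
              f"({pct}% did not return ({lost}))")
```

The differences come out to 51415, 35763 and 49815, so the quoted numbers are at least internally consistent.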

Another, more likely scenario: a multithreaded process has many
threads blocked (e.g. on a futex), and one of its threads exits
cleanly or forks. I suspect the kernel then just tears down the
other threads without them ever hitting syscall exit.
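
To make that scenario concrete, here is a small Python sketch of the user-space side of it. It assumes Linux and CPython, where `threading.Event.wait()` typically ends up blocked in a futex wait and `os._exit()` ends the whole process (via exit_group, as far as I know). The names `CHILD` and `run_demo` are made up for this example. It only shows that the blocked threads never resume in user space; whether their return probes fire in the kernel is exactly the open question here, and the sketch does not answer it.

```python
import subprocess
import sys
import textwrap

# Child program: four threads announce themselves, then block forever.
# The main thread then terminates the whole process.
CHILD = textwrap.dedent("""
    import os, threading, time

    never_set = threading.Event()
    started = threading.Semaphore(0)

    def worker(i):
        print(f"enter {i}", flush=True)
        started.release()
        never_set.wait()                  # blocks; never woken up
        print(f"return {i}", flush=True)  # not expected to run

    for i in range(4):
        threading.Thread(target=worker, args=(i,), daemon=True).start()

    for _ in range(4):
        started.acquire()                 # wait until every worker has run
    time.sleep(0.2)                       # give them time to actually block
    os._exit(0)                           # end the process, blocked threads included
""")


def run_demo():
    """Run the child and count how many workers entered and returned."""
    out = subprocess.run([sys.executable, "-c", CHILD],
                         capture_output=True, text=True, timeout=30).stdout
    # Count substrings so interleaved output from threads does not matter.
    return out.count("enter"), out.count("return")


if __name__ == "__main__":
    entered, returned = run_demo()
    print(f"entered: {entered}, returned: {returned}")
```

Running the same child under the probe script from the gist would be one way to see whether the entry and return counts diverge for it.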

Thanks,

Mathieu

> 
>> > 
>> > Results (60 sec):
>> > futex:       p 56904    r 5489     (90% did not return (51415))
>> > poll:        p 43466    r 7703     (82% did not return (35763))
>> > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
>> > select:      p 13355    r 13351    (0% did not return (4))
> 
> All of these are common system calls that tasks simply sleep on. But it
> would take a nasty kill to have them not return to the program to
> clean up nicely. Another possibility is that these actually have another
> way out of the kernel that isn't caught by tracing. I'll take a look.
> 
> -- Steve
> 
> 
> 
>> > fork:        p 0        r 0        (OK)
>> > vfork:       p 0        r 0        (OK)
>> > mmap:        p 4328     r 4328     (OK)
>> > open:        p 4579     r 4579     (OK)
>> > close:       p 7163     r 7163     (OK)
>> > write:       p 22769    r 22769    (OK)
>> > read:        p 40014    r 40014    (OK)
>> > 
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-trace-users" in
>> > the body of a message to majordomo@xxxxxxxxxxxxxxx
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com