Re: Why return probes of some syscalls sometimes are not called?

On Thu, 9 Mar 2017 09:44:55 -0500
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

> On Thu, 9 Mar 2017 13:58:29 +0000 (UTC)
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> 
> > ----- On Mar 9, 2017, at 8:44 AM, Dorau, Lukasz lukasz.dorau@xxxxxxxxx wrote:
> >   
> > > Hi,
> > > 
> > > Could someone explain to me why the return probes of some syscalls (for
> > > example: futex, poll, epoll_wait) are sometimes not called?
> > > 
> > > It can be reproduced using the following bash script:
> > > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > > 
> > > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora 24):
> > > 
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))    
> > 
> > Most likely scenario: those processes are still blocked on those
> > system calls when your tracing ends.  
> 
> This is very common, but those numbers are very high. I doubt there are
> 51 thousand threads blocked on a futex when tracing ended.
> 
> > 
> > AFAIU, another possible (less frequent) scenario: a process gets
> > killed with SIGKILL while blocked on the system call.
> >   
> 
> This could be.
> 
> > > 
> > > Results (60 sec):
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > > select:      p 13355    r 13351    (0% did not return (4))  
> 
> All these are common system calls that tasks simply sleep on. But it
> would take a nasty kill to have them not return back to the program to
> clean up nicely. Another possibility is that these actually have another
> way out from the kernel that isn't caught by tracing. I'll take a look.
> 

BTW, what happens if you change your script to use the syscall
tracepoints instead? Each syscall has both an entry and an exit
tracepoint (sys_enter_* and sys_exit_* in the syscalls event group).
Do the results change?

-- Steve
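
As an illustration of the tracepoint comparison asked about above, here is a minimal sketch and not the script from the gist. It assumes tracefs is mounted at /sys/kernel/tracing (older kernels such as 4.9 usually expose it at /sys/kernel/debug/tracing), that it runs as root, and that entry and exit events appear in the trace file as "sys_NAME(...)" and "sys_NAME -> ...". The helper names report and trace_syscall are hypothetical.

```shell
#!/bin/bash
# Sketch only: count entry vs. exit syscall tracepoint hits.
# Assumption: tracefs lives at $TRACEFS and we have root privileges.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# report NAME ENTRIES EXITS
# Print one line in roughly the layout of the results quoted above.
# The percentage uses integer division, so it is truncated.
report() {
    local name=$1 entries=$2 exits=$3
    local missing=$((entries - exits)) pct=0
    if [ "$entries" -gt 0 ]; then
        pct=$((missing * 100 / entries))
    fi
    printf '%s: p %d r %d (%d%% did not return (%d))\n' \
        "$name" "$entries" "$exits" "$pct" "$missing"
}

# trace_syscall NAME SECONDS   (hypothetical helper)
# Enable both tracepoints of one syscall, wait, then count the
# events that are still in the ring buffer.
trace_syscall() {
    local name=$1 secs=$2 ev="$TRACEFS/events/syscalls"
    local entries exits
    echo > "$TRACEFS/trace"                    # clear the buffer
    echo 1 > "$ev/sys_enter_$name/enable"
    echo 1 > "$ev/sys_exit_$name/enable"
    sleep "$secs"
    echo 0 > "$ev/sys_enter_$name/enable"
    echo 0 > "$ev/sys_exit_$name/enable"
    # Assumed output format: "sys_NAME(args)" on entry, "sys_NAME -> ret" on exit.
    entries=$(grep -c "sys_$name(" "$TRACEFS/trace")
    exits=$(grep -c "sys_$name -> " "$TRACEFS/trace")
    report "$name" "$entries" "$exits"
}

# Example (as root): trace_syscall futex 60
```

Note that the ring buffer overwrites its oldest events when it fills up, so on a busy system these counts can undercount; treat the output as a rough cross-check against the kprobe/kretprobe numbers rather than an exact figure.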