RE: Why return probes of some syscalls sometimes are not called?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Friday, March 10, 2017 11:51 AM Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> On Thu, 9 Mar 2017 18:00:54 +0100
> Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
> 
> > On Thu, 9 Mar 2017 13:44:38 +0000
> > "Dorau, Lukasz" <lukasz.dorau@xxxxxxxxx> wrote:
> >
> > > Hi,
> > >
> > > Could someone explain me why return probes of some syscalls (for example:
> futex, poll, epoll_wait) sometimes are not called?
> > >
> > > It can be reproduced using the following bash script:
> > > https://gist.github.com/ldorau/c439d9ec7635409a5016c42e3a9121ec
> > >
> > > Here are results gathered from 60 seconds test run on kernel 4.9.12 (Fedora
> 24):
> > >
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > >
> > > The whole log is attached below.
> > >
> > > Lukasz
> > >
> > > ---
> > > # ./test_kprobes.sh 60
> > >
> > > Will trace using following kprobe_events:
> > > r:kprobes/r_futex sys_futex
> > > p:kprobes/p_futex sys_futex
> > > r:kprobes/r_poll sys_poll
> > > p:kprobes/p_poll sys_poll
> > > r:kprobes/r_epoll_wait sys_epoll_wait
> > > p:kprobes/p_epoll_wait sys_epoll_wait
> > > r:kprobes/r_select sys_select
> > > p:kprobes/p_select sys_select
> > > r:kprobes/r_fork sys_fork
> > > p:kprobes/p_fork sys_fork
> > > r:kprobes/r_vfork sys_vfork
> > > p:kprobes/p_vfork sys_vfork
> > > r:kprobes/r_mmap sys_mmap
> > > p:kprobes/p_mmap sys_mmap
> > > r:kprobes/r_open sys_open
> > > p:kprobes/p_open sys_open
> > > r:kprobes/r_close sys_close
> > > p:kprobes/p_close sys_close
> > > r:kprobes/r_write sys_write
> > > p:kprobes/p_write sys_write
> > > r:kprobes/r_read sys_read
> > > p:kprobes/p_read sys_read
> > >
> > > Results (60 sec):
> > > futex:       p 56904    r 5489     (90% did not return (51415))
> > > poll:        p 43466    r 7703     (82% did not return (35763))
> > > epoll_wait:  p 73366    r 23551    (67% did not return (49815))
> > > select:      p 13355    r 13351    (0% did not return (4))
> > > fork:        p 0        r 0        (OK)
> > > vfork:       p 0        r 0        (OK)
> > > mmap:        p 4328     r 4328     (OK)
> > > open:        p 4579     r 4579     (OK)
> > > close:       p 7163     r 7163     (OK)
> > > write:       p 22769    r 22769    (OK)
> > > read:        p 40014    r 40014    (OK)
> > >
> >
> > OK, I just dumped kprobe_profile after the test,
> >
> > futex:       p 7775     r 698      (91% did not return (7077))
> > poll:        p 18409    r 3523     (80% did not return (14886))
> > epoll_wait:  p 5122     r 2389     (53% did not return (2733))
> > select:      p 5411     r 5286     (2% did not return (125))
> > fork:        p 0        r 0        (OK)
> > vfork:       p 0        r 0        (OK)
> > mmap:        p 179      r 179      (OK)
> > open:        p 921      r 921      (OK)
> > close:       p 1005     r 1005     (OK)
> > write:       p 5632     r 5631     (0% did not return (1))
> > read:        p 10879    r 10878    (0% did not return (1))
> >   r_futex                                                 1091               0
> >   p_futex                                                 8986               0
> >   r_poll                                                  4276               0
> >   p_poll                                                 20610               0
> >   r_epoll_wait                                            2849               0
> >   p_epoll_wait                                            5905               0
> >   r_select                                               10172               0
> >   p_select                                               10312               0
> >   r_fork                                                     0               0
> >   p_fork                                                     0               0
> >   r_vfork                                                    0               0
> >   p_vfork                                                    0               0
> >   r_mmap                                                  2987               0
> >   p_mmap                                                  2884               0
> >   r_open                                                  4002               0
> >   p_open                                                  3855               0
> >   r_close                                                 4262               0
> >   p_close                                                 4122               0
> >   r_write                                                14281               0
> >   p_write                                                14276               0
> >   r_read                                                 41494               0
> >   p_read                                                 41363               0
> >
> > OK, there are no miss-counted event. So at least kprobe and kretprobe working
> correctly.
> > I also tried to use static tracepoints for that.
> > https://gist.github.com/mhiramat/04d782deaf110a6b3391e2ecb923325e
> >
> > ----
> > $ sudo bash test_kprobes.sh 10
> > Disabling all current probes ...
> > Removing old probe points ...
> > Adding new probe points ...
> > Enabling new probe points ...
> > Clearing old trace log ...
> >
> > Tracing for 10 seconds ... done
> > Saving trace log ... done
> >
> > Results (10 sec):
> > futex:       p 9856     r 1258     (87% did not return (8598))
> > poll:        p 7531     r 4031     (46% did not return (3500))
> > epoll_wait:  p 1996     r 1617     (18% did not return (379))
> > select:      p 273      r 257      (5% did not return (16))
> > Results (10 sec):
> > futex:       p 9851     r 9851     (OK)
> > poll:        p 7531     r 7531     (OK)
> > epoll_wait:  p 1996     r 1996     (OK)
> > select:      p 273      r 273      (OK)
> > ----
> > Of course these events are collected in "different places" so
> > it could be different. But yeah, it seems sys_* function itself
> > dosen't return but jumps into ret_from_syscall etc.
> 
> Please try attached patch (that is my miss that forgot to accumulate missed
> count of kretprobe).
> 

Thanks for the patch. I ran two tests using the following script:
https://gist.github.com/ldorau/36d87640ff0659a483342d490b66f034
The results are following:

1) The first test - during 60-seconds test the system was almost idle
   (I did not touch the keyboard or mouse)

KProbes results (60 sec):
futex:       p 1373     r 422      (69% did not return (951))
poll:        p 3304     r 1575     (52% did not return (1729))
epoll_wait:  p 1062     r 943      (11% did not return (119))
select:      p 5978     r 5977     (0% did not return (1))
mmap:        p 264      r 264      (OK)
open:        p 1056     r 1056     (OK)
close:       p 1198     r 1198     (OK)
write:       p 4786     r 4786     (OK)
read:        p 8939     r 8939     (OK)
exit:        p 1        r 0    
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
clone:       p 5        r 5        (OK)
execve:      p 4        r 4        (OK)

Tracepoints results (60 sec):
futex:       p 1372     r 1372     (OK)
poll:        p 3304     r 3303     (0% did not return (1))
epoll_wait:  p 1062     r 1062     (OK)
select:      p 5978     r 5978     (OK)
mmap:        p 264      r 264      (OK)
open:        p 1056     r 1056     (OK)
close:       p 1198     r 1198     (OK)
write:       p 4786     r 4786     (OK)
read:        p 8939     r 8940     (OK)
exit:        p 1        r 0    

2) The second test - during 60-seconds test I was normally using the system -
   - I launched an internet browser, edited a file...

KProbes results (60 sec):
futex:       p 35394    r 689      (98% did not return (34705))
poll:        p 23449    r 896      (96% did not return (22553))
epoll_wait:  p 43927    r 4973     (88% did not return (38954))
select:      p 4874     r 4873     (0% did not return (1))
mmap:        p 3165     r 3165     (OK)
open:        p 3680     r 3678     (0% did not return (2))
close:       p 6704     r 6704     (OK)
write:       p 11769    r 11769    (OK)
read:        p 22645    r 22647    (OK)
exit:        p 30       r 0        
fork:        p 0        r 0        (OK)
vfork:       p 0        r 0        (OK)
clone:       p 170      r 170      (OK)
execve:      p 2        r 2        (OK)

Tracepoints results (60 sec):
futex:       p 35359    r 35268    (0% did not return (91))
poll:        p 23448    r 23444    (0% did not return (4))
epoll_wait:  p 43926    r 43909    (0% did not return (17))
select:      p 4874     r 4874     (OK)
mmap:        p 3165     r 3165     (OK)
open:        p 3680     r 3827     (OK)
close:       p 6704     r 6704     (OK)
write:       p 11769    r 11770    (OK)
read:        p 22645    r 22648    (OK)
exit:        p 30       r 0        

Lukasz



> I've focused on futex and it showed there were some "misses" of kretprobe.
> 
> ----
> Disabling all current probes ...
> Removing old probe points ...
> Adding new probe points ...
> Enabling new probe points ...
> Clearing old trace log ...
> 
> Tracing for 10 seconds ... done
> Saving trace log ... done
> 
> Results (10 sec):
> futex:       p 190      r 74       (61% did not return (116))
> Results (10 sec):
> futex:       p 190      r 190      (OK)
> 
> $ cat /sys/kernel/debug/tracing/kprobe_profile
>   r_futex                                                  176             127
>   p_futex                                                  298               0
> -----
> 
> Actually, since the kretprobe is not well designed for the usecase
> probing on the sleep-able functions, it just fails if it invoked
> more than the number of CPUs in parallel.
> It is currently a limitation of kretprobe, but I'll solve it.
> 
> Thanks,
> 
> --
> Masami Hiramatsu <mhiramat@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-trace-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux USB Development]     [Linux USB Development]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux