Re: improve ps performance

----- Original Message -----
> On 09/20/2014 03:15 AM, Dave Anderson wrote:
> >
> > ----- Original Message -----
> >> Hello Pan,
> >>
> >> I've updated the patch I attached yesterday with a change that
> >> caches the most-recent tgid search result.  From ~70% to ~90% of
> >> the time, either the last tgid entry or the very next one in the
> >> tgid_array is the one being searched for, so it's not necessary
> >> to call bsearch() every time.  "help -t" will show the cache-hit
> >> statistics.
> >>
> >> Thanks,
> >>    Dave
> > Hello Pan,
> >
> > This patch as written needs to be made less restrictive for use
> > on a live system.
> >
> > When running on a live system that has many tasks constantly
> > forking/exec'ing, the "ps" command may occasionally fail like so:
> >
> >    crash>  ps
> >         PID    PPID  CPU       TASK        ST  %MEM     VSZ    RSS  COMM
> >          0      0   0  ffffffff81c13440  RU   0.0       0      0  [swapper/0]
> >          0      0   1  ffff88021282d330  RU   0.0       0      0  [swapper/1]
> >    >      0      0   2  ffff88021282dac0  RU   0.0       0      0  [swapper/2]
> >          0      0   3  ffff88021282e250  RU   0.0       0      0  [swapper/3]
> >          1      0   1  ffff880212828000  IN   0.0   50140   3120  systemd
> >          2      0   3  ffff880212828790  IN   0.0       0      0  [kthreadd]
> >    ... [ cut ] ...
> >       7578  27670   0  ffff8801f45e3c80  DE   0.0       0      0  cc
> >       7622  27668   1  ffff880210ee3c80  ZO   0.0       0      0  info
> >       7629  27667   1  ffff8801075bd330  DE   0.0       0      0  rev
> >       7631  27680   0  ffff8801075bf170  ZO   0.0       0      0  printenv
> >       7635  27685   3  ffff880108bbe9e0  ZO   0.0       0      0  ypwhich
> >    ps: bsearch for tgid failed: task: ffff880210ee6250 tgid: 7654
> >    crash>
> >
> > Without this patch, the search for the matching tgid would not generate
> > an error at all, but just quietly continue.
> >
> > The problem is due to the task.tgid may change on a live system, or more
> > likely, the task itself may have been re-used.
> >
> > I would like to fix it simply ignoring tgid bsearch failures on live
> > systems,
> > and just use the RSS stats stored in the per-tgid mm_struct.
> >
> > Does that work for you?
> >
> > Dave
> >
> >
> > .
> >
> ok!
> But I don't understand the meaning of "
> 
> fix it simply ignoring tgid bsearch failures on live systems,
> and just use the RSS stats stored in the per-tgid mm_struct.
> 
> ", if tgid may be changed, the tgid_array is useless on live systems.

Well, in this case, it may be true for a particular task if the task struct
had been re-used in between the time the arrays were created and the time
that the "ps" command gets around to reading and displaying its various
statistics.  And so the command may read invalid data with respect to that task.

But let's be clear -- that kind of behavior is, and always has been, an 
unavoidable circumstance when running the crash utility on live systems, or
when looking at a "live" dump.

It's not just the "ps" command, but any command that displays data that
is subject to the "shifting sands" syndrome, where the kernel data is
constantly being modified while the crash command is running.  

So the idea is not to cancel the whole command with an error(FATAL...)
if such an anomaly occurs on a live system.

> And what is the "RSS stats stored in the per-tgid mm_struct" used for?

Sorry -- I meant to quietly skip the checking of the other tasks in the
task group, and simply use whatever is stored in the mm_struct pointed to
by the original task.  Without your patch, if the tgid was not found, the
command would just continue.  With your patch applied, it would be OK to
do the error(FATAL) in the case of a static dumpfile.  But in the case of
a live system (or live dump), it's not worth killing the command at that
point. 
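
To illustrate the idea, here is a minimal sketch in C.  It is not the
actual patch: the names tgid_entry, tgid_table, tgid_lookup() and
resolve_group() are made up for this example, and the is_live argument
stands in for however the real code tells a live system from a dumpfile.
It only shows the control flow I have in mind: try the cached entry and
the one after it, fall back to bsearch(), and on a miss either skip the
task group (live) or report a fatal condition (dumpfile).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real crash data structures. */
struct tgid_entry {
    unsigned long tgid;
    unsigned long task;
};

struct tgid_table {
    struct tgid_entry *entries;   /* sorted by ascending tgid */
    size_t count;
    size_t last;                  /* index of the most recent hit */
    unsigned long cache_hits;
    unsigned long searches;
};

/* bsearch() comparator: key is a tgid, elem is a table entry. */
static int cmp_tgid(const void *key, const void *elem)
{
    unsigned long k = *(const unsigned long *)key;
    unsigned long e = ((const struct tgid_entry *)elem)->tgid;

    return (k > e) - (k < e);
}

/* Check the last hit and its successor before paying for a bsearch(). */
static struct tgid_entry *tgid_lookup(struct tgid_table *t, unsigned long tgid)
{
    struct tgid_entry *hit;
    size_t i;

    t->searches++;
    for (i = t->last; i < t->count && i <= t->last + 1; i++) {
        if (t->entries[i].tgid == tgid) {
            t->last = i;
            t->cache_hits++;
            return &t->entries[i];
        }
    }

    hit = bsearch(&tgid, t->entries, t->count, sizeof(*t->entries), cmp_tgid);
    if (hit)
        t->last = (size_t)(hit - t->entries);
    return hit;
}

enum group_result { GROUP_FOUND, GROUP_SKIPPED, GROUP_FATAL };

/*
 * On a miss: a live system skips the task-group walk (the caller would
 * use the values in the task's own mm_struct); a dumpfile treats it as
 * fatal.  The real code would call error(FATAL, ...) instead of returning.
 */
static enum group_result resolve_group(struct tgid_table *t,
                                       unsigned long tgid, int is_live)
{
    if (tgid_lookup(t, tgid))
        return GROUP_FOUND;
    if (is_live)
        return GROUP_SKIPPED;
    fprintf(stderr, "bsearch for tgid failed: tgid: %lu\n", tgid);
    return GROUP_FATAL;
}
```

The point of returning a result instead of bailing out inside the lookup
is that the caller decides how serious a miss is, so the same search
code serves both the live and the dumpfile case.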

Clear?

Dave

> More clearly, please.
>    thanks,
>       Pan
> 

--
Crash-utility mailing list
Crash-utility@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/crash-utility



