On Fri, 29 May 2020 10:09:57 +0200
Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> On Thu, May 28, 2020 at 06:39:08PM -0700, Axel Rasmussen wrote:
>
> > The use case we have in mind for this is to enable this instrumentation
> > widely in Google's production fleet. Internally, we have a userspace thing
> > which scrapes these metrics and publishes them such that we can look at
> > aggregate metrics across our fleet. The thinking is that mechanisms like
> > lockdep or getting histograms with e.g. BPF attached to the tracepoint
> > introduce too much overhead for this to be viable. (Although, granted, I
> > don't have benchmarks to prove this - if there's skepticism, I can produce
> > such a thing - or prove myself wrong and rethink my approach. :) )
>
> Whichever way around; I don't believe in special instrumentation like
> this. We'll grow a thousand separate pieces of crap if we go this route.
>
> Next on, someone will come and instrument yet another lock, with yet
> another 1000 lines of gunk.
>
> Why can't you kprobe the mmap_lock things and use ftrace histograms?

+1.

As far as I can see from the series, if you want a histogram of
lock-acquisition latency, you might only need patch 7/7 (but that is a
minimum subset).

I would recommend introducing a set of tracepoints -- start-locking,
locked, and released -- so that we can investigate which process is
waiting for which one. Then you can easily build a histogram with
either BPF or ftrace.

Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>
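
[The kprobe-plus-ftrace-histogram approach suggested above could be
sketched roughly as follows. This is a hypothetical example, not part of
the original thread: it probes the rwsem write slowpath (the symbol name
`rwsem_down_write_slowpath` and the event names `lock_start`,
`lock_end`, and `mmap_lock_lat` are assumptions for illustration), and
it needs root plus a kernel with CONFIG_KPROBE_EVENTS and
CONFIG_HIST_TRIGGERS.]

```shell
# Sketch only: measure how long tasks spend in the rwsem write slowpath
# using kprobes and ftrace hist triggers (no new instrumentation needed).
# Symbol and event names below are assumptions, not from the thread.
cd /sys/kernel/tracing

# Probe entry and return of the (assumed) contended-write slowpath.
echo 'p:lock_start rwsem_down_write_slowpath' >> kprobe_events
echo 'r:lock_end rwsem_down_write_slowpath'   >> kprobe_events

# Synthetic event that will carry the computed latency.
echo 'mmap_lock_lat u64 lat; pid_t pid' >> synthetic_events

# On entry: remember a per-pid timestamp in variable ts0.
echo 'hist:keys=common_pid:ts0=common_timestamp.usecs' \
    > events/kprobes/lock_start/trigger

# On return: compute latency and fire the synthetic event.
echo 'hist:keys=common_pid:lat=common_timestamp.usecs-$ts0:onmatch(kprobes.lock_start).mmap_lock_lat($lat,common_pid)' \
    > events/kprobes/lock_end/trigger

# Histogram the latencies, keyed by pid.
echo 'hist:keys=pid,lat:sort=lat' > events/synthetic/mmap_lock_lat/trigger
cat events/synthetic/mmap_lock_lat/hist
```

With the tracepoints proposed above (start-locking, locked, released),
the same hist triggers could attach to those events directly instead of
kprobing an internal symbol.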