Re: [PATCH RFC 4/5] block: add a statistic table for io latency

On 7/10/20 12:00 PM, Ming Lei wrote:
On Fri, Jul 10, 2020 at 10:55:24AM +0200, Guoqing Jiang wrote:
Hi Ming,

On 7/10/20 2:53 AM, Ming Lei wrote:
Hi Guoqing,

On Thu, Jul 09, 2020 at 08:48:08PM +0200, Guoqing Jiang wrote:
Hi Ming,

On 7/8/20 4:06 PM, Guoqing Jiang wrote:
On 7/8/20 4:02 PM, Guoqing Jiang wrote:
Hi Guoqing,

I believe it isn't hard to write an eBPF-based script (bcc or bpftrace)
to collect this kind of performance data, so it doesn't look necessary
to do it in the kernel.
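
For illustration, a minimal bcc-style sketch along those lines could look
roughly like the following (probe points and struct layout are assumptions
that vary by kernel version; this is a simplified, untested take on what
bcc's biolatency does):

from bcc import BPF
import time

# Simplified sketch: take a timestamp when a request starts and histogram
# the delta when its completion is accounted.
bpf_text = """
#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>

BPF_HASH(start, struct request *);
BPF_HISTOGRAM(dist);

int trace_req_start(struct pt_regs *ctx, struct request *req)
{
    u64 ts = bpf_ktime_get_ns();
    start.update(&req, &ts);
    return 0;
}

int trace_req_completion(struct pt_regs *ctx, struct request *req)
{
    u64 *tsp = start.lookup(&req);
    if (tsp) {
        u64 delta = bpf_ktime_get_ns() - *tsp;
        dist.increment(bpf_log2l(delta / 1000));   /* usecs */
        start.delete(&req);
    }
    return 0;
}
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="blk_mq_start_request", fn_name="trace_req_start")
b.attach_kprobe(event="blk_account_io_completion", fn_name="trace_req_completion")

time.sleep(10)                        # let it collect for a while
b["dist"].print_log2_hist("usecs")    # dump the latency histogram
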
Hi Ming,

Sorry, I don't know bcc or bpftrace well, but I assume they need to
read the latency value from somewhere inside the kernel. Could you
point out how I can get the latency value? Thanks in advance!
Hmm, I suppose biolatency is suitable for tracking latency, will look
into it.
I think biolatency can't trace data if it is not running,
Yeah, the ebpf prog is only injected when the trace is started.

also it seems no place inside the kernel records such information for
eBPF to read, correct me if my understanding is wrong.
Just record the info by starting the bcc script when you need it; is
there anything wrong with this usage? Always doing such stuff in the
kernel isn't fair to users who don't care about or need this info.
That is why we add a Kconfig option and set it to N by default. And I
suppose that with a modern CPU the cost of a few more instructions would
not be that expensive even when the option is enabled, just my $0.02.

And as a cloud provider, we would like to get the data when necessary
instead of collecting it by keeping a script running, because that is
more expensive than just reading a node IMHO.
It shouldn't be expensive. It might be a bit slow to inject the eBPF prog
because the code has to be verified; however, once it is inside the kernel
it should be efficient enough. The kernel-side prog only updates & stores
the latency summary data in a BPF map, and the stored summary data can be
read out at any time by userspace.
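
To illustrate that last point with the kind of sketch above (assuming a
histogram map named "dist" as in the earlier example, not anything from
the patch itself): once the prog is loaded, userspace can read the
aggregated summary on demand, without re-injecting anything:

import time

# Hypothetical read-out loop: the kernel-side prog keeps aggregating into
# the "dist" map; userspace just prints (and optionally resets) it on demand.
while True:
    time.sleep(10)
    b["dist"].print_log2_hist("usecs")
    b["dist"].clear()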

Could you explain a bit why it is expensive, such as with biolatency?
I was comparing reading a sysfs node plus a few extra instructions in the
kernel against launching a dedicated monitoring process, which needs to
occupy more resources (memory) and causes context switches. And for
biolatency, it calls bpf_ktime_get_ns() to calculate the latency of each
IO, which I assume ends up triggering ktime_get_ns(), and that is not
cheap, as you said.
You can replace one read of the timestamp with rq->start_time_ns too,
just like this patch does. You can write your own bcc/bpftrace script,
which is quite easy to get started with. Once you learn its power, maybe
you will love it.
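
A hedged sketch of that suggestion, assuming rq->start_time_ns is
populated on the kernel in question (the probe point is again an
assumption): the completion probe can compute the delta directly from the
request, so the start-side probe and the per-request hash go away and only
one timestamp read per IO remains:

from bcc import BPF

bpf_text = """
#include <uapi/linux/ptrace.h>
#include <linux/blkdev.h>

BPF_HISTOGRAM(dist);

/* Use the start time already recorded in the request instead of taking
   our own timestamp at dispatch time. */
int trace_req_completion(struct pt_regs *ctx, struct request *req)
{
    u64 delta = bpf_ktime_get_ns() - req->start_time_ns;
    dist.increment(bpf_log2l(delta / 1000));   /* usecs */
    return 0;
}
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="blk_account_io_completion", fn_name="trace_req_completion")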

Yes, I definitely need to learn more about it :-). But even with that
change, I still believe reading a node is cheaper than running a script.

And it seems biolatency can't trace bio-based drivers, per the code below,
while by collecting the data in-tree we can trace all block drivers.

# load BPF program
b = BPF(text=bpf_text)
if args.queued:
    b.attach_kprobe(event="blk_account_io_start", fn_name="trace_req_start")
else:
    b.attach_kprobe(event="blk_start_request", fn_name="trace_req_start")
    b.attach_kprobe(event="blk_mq_start_request", fn_name="trace_req_start")
b.attach_kprobe(event="blk_account_io_completion",
    fn_name="trace_req_completion")

Could it be possible to extend it to support tracing both requests and
bios? Otherwise we have to run another script to trace md raid.
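
For what it's worth, a rough sketch of such an extension (using
submit_bio/bio_endio as the bio-path hooks is an assumption, and stacked
devices like md would double-count the underlying requests unless
filtered; untested):

from bcc import BPF
import time

# Hypothetical extra probes keyed on the struct bio pointer, so bio-based
# drivers (e.g. md raid) are covered in addition to request-based ones.
bio_text = """
#include <uapi/linux/ptrace.h>
#include <linux/blk_types.h>

BPF_HASH(bio_start, struct bio *);
BPF_HISTOGRAM(bio_dist);

int trace_bio_start(struct pt_regs *ctx, struct bio *bio)
{
    u64 ts = bpf_ktime_get_ns();
    bio_start.update(&bio, &ts);
    return 0;
}

int trace_bio_done(struct pt_regs *ctx, struct bio *bio)
{
    u64 *tsp = bio_start.lookup(&bio);
    if (tsp) {
        bio_dist.increment(bpf_log2l((bpf_ktime_get_ns() - *tsp) / 1000));
        bio_start.delete(&bio);
    }
    return 0;
}
"""

b = BPF(text=bio_text)
b.attach_kprobe(event="submit_bio", fn_name="trace_bio_start")
b.attach_kprobe(event="bio_endio", fn_name="trace_bio_done")

time.sleep(10)
b["bio_dist"].print_log2_hist("usecs")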

Thanks,
Guoqing


