A couple of options:

1) You can enable LTTng-UST tracing [1][2] against your VM for an extremely
   lightweight way to track IO latencies.
2) You can enable "debug rbd = 20" on the client and grep through the logs
   for matching "AioCompletion.*(set_request_count|finalize)" entries.
3) You can use the asok (admin socket) file during one of these events to
   dump the in-flight objecter requests.

[1] http://docs.ceph.com/docs/jewel/rbd/rbd-replay/
[2] http://tracker.ceph.com/issues/14629

On Tue, Apr 4, 2017 at 7:36 AM, Laszlo Budai <laszlo@xxxxxxxxxxxxxxxx> wrote:
> Hello cephers,
>
> I have a situation where, from time to time, write operations to the Ceph
> storage hang for 3-5 seconds. For testing we have a simple loop like:
>
> while sleep 1; do date >> logfile; done &
>
> With this we can see that, rarely, there are 3 seconds or more between
> consecutive outputs of date.
> Initially we suspected deep scrub and tuned its parameters, so right now
> I'm confident that the cause is something other than deep scrubbing.
>
> I would like to know whether any of you have encountered a similar
> situation, and what the solution was.
> I suspect the network between the compute nodes and the storage, but I
> need to prove this. I am thinking of enabling client-side logging for
> librbd, but I see there are many subsystems where logging can be enabled.
> Can anyone tell me which subsystem I should log, and at which level, to
> see whether the network is causing the write issues?
> We're using Ceph 0.94.10.
>
> Thank you,
> Laszlo
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Jason
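
P.S. For option 2, a minimal sketch of the log filter. The sample log line
below is fabricated for illustration; real "debug rbd = 20" entries carry
timestamps whose gaps point at the slow requests, and the log path depends on
your "log file" setting in the [client] section of ceph.conf:

```shell
# Filter verbose librbd client logs for AioCompletion accounting entries.
# The sample line is fabricated; in practice point the grep at your real
# client log, e.g. /var/log/ceph/client.log.
printf '%s\n' \
  '2017-04-04 07:36:01.123456 7f2a 20 librbd::AioCompletion: finalize: rval=0' \
  | grep -E 'AioCompletion.*(set_request_count|finalize)'
```

To capture such entries, set "debug rbd = 20" and a "log file" under
[client] in ceph.conf on the compute node before starting the VM.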
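
The date loop in the quoted message only shows stalls indirectly, by gaps in
the output. A sketch of a variant that records the stalls themselves — the
2-second threshold and the file names are arbitrary choices, and it runs
three iterations here just for illustration (use "while true" in practice):

```shell
# Append a timestamp once a second and log any iteration whose wall-clock
# gap exceeds the threshold, i.e. a write that hung.
threshold=2
prev=$(date +%s)
for i in 1 2 3; do
  sleep 1
  date >> logfile            # the write whose latency we are probing
  now=$(date +%s)
  gap=$((now - prev))
  if [ "$gap" -gt "$threshold" ]; then
    echo "stall: ${gap}s ending $(date)" >> stalls.log
  fi
  prev=$now
done
```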