Re: There is no rbd_aio_write_traced function in librbd

Mohamad Gebai <mgebai@xxxxxxx> · Mon, 13 Nov 2017 10:03:34 -0500

+ceph-devel

Hi,

Do you want to run LTTng on production or on a vstart cluster? If you
want to start with a vstart cluster, make sure to have a recent version
of Ceph, as some fixes for tracing were introduced "recently". That blog
post is kind of old, so I'll reiterate the steps here, as there have been
changes since it was written. With LTTng (not Blkin), it should be
straightforward:

- get a recent version of Ceph source code
- make sure to have WITH_LTTNG=ON (and optionally WITH_BABELTRACE=ON) in
your cmake command
- build Ceph
- add osd_tracing = true to ceph.conf
- launch the vstart cluster without LD_PRELOAD or any other "unusual"
setting
    + LD_PRELOAD=lttng-ust-fork isn't required anymore when tracing Ceph
with LTTng (not Blkin)
- "lttng list -u" should list the available tracepoints
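
As a rough sketch, the LTTng steps above could look like the following on a
vstart cluster. This is not a tested recipe. The ./build directory, the
function names, and the use of vstart.sh's -o option to inject the ceph.conf
line are assumptions; if your vstart.sh has no -o, edit ceph.conf by hand
instead. Only WITH_LTTNG, WITH_BABELTRACE, osd_tracing and "lttng list -u"
come from the steps themselves.

```shell
# Sketch only. Assumptions: run from the top of a Ceph checkout, the build
# directory is ./build, and vstart.sh accepts -o to add a line to ceph.conf.
# Defined as a function so nothing runs until you call it.
lttng_vstart_steps() {
    mkdir -p build && cd build || return 1
    cmake -DWITH_LTTNG=ON -DWITH_BABELTRACE=ON .. || return 1
    make || return 1
    # No LD_PRELOAD here; osd_tracing goes into the generated ceph.conf.
    ../src/vstart.sh -n -d -o "osd_tracing = true" || return 1
    # Should print the registered userspace tracepoints.
    lttng list -u
}

# Hypothetical helper: count the tracepoints of one provider in the output
# of "lttng list -u" read from stdin. Assumes event lines that look like
# "      osd:some_event (loglevel: ...) (type: tracepoint)".
count_ust_events() {
    provider="$1"
    # grep -c exits non-zero on zero matches; keep the "0" and succeed.
    grep -c "^[[:space:]]*${provider}:" || true
}
```

Once the cluster is up, "lttng list -u | count_ust_events osd" printing 0
would suggest that osd_tracing did not take effect.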

As for Blkin, I'm not sure about its current state. Last time I tried, I
wasn't able to get it to work, and I ran into issues at different layers
(Ceph, LTTng, python libraries, Zipkin server). If you have a recent
version of Ceph, tracing should work (but not analyzing the trace using
Zipkin, at least for me). These are the steps:

- get a recent version of Ceph source code
- make sure to have WITH_BLKIN=ON in your cmake command
- build Ceph
- add osd_blkin_trace_all=true and other *blkin_trace*=true options as
needed
- launch the vstart cluster without LD_PRELOAD
    + linking against lttng-ust-fork *is* required for Blkin, but it was
added by default recently
- "lttng list -u" should list the available tracepoints
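
The Blkin steps above could be sketched the same way. Again, this is an
untested outline under assumptions: the ./build directory, the helper and
function names, and vstart.sh's -o option are mine, and osd_blkin_trace_all
is the only option name taken from the steps. Pass whichever other
*blkin_trace* options you need as extra arguments.

```shell
# Hypothetical helper: print one "name = true" line per option name given.
blkin_conf_lines() {
    for opt in "$@"; do
        printf '%s = true\n' "$opt"
    done
}

# Sketch only. Assumptions: run from the top of a Ceph checkout, the build
# directory is ./build, and vstart.sh accepts -o to add lines to ceph.conf.
# Defined as a function so nothing runs until you call it.
blkin_vstart_steps() {
    mkdir -p build && cd build || return 1
    cmake -DWITH_BLKIN=ON .. || return 1
    make || return 1
    # No LD_PRELOAD; recent builds link against lttng-ust-fork by default.
    ../src/vstart.sh -n -d -o "$(blkin_conf_lines osd_blkin_trace_all)" || return 1
    lttng list -u
}
```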

If you want to use LTTng or Blkin on an already deployed cluster, it
might be trickier. The first step would be to know which version of Ceph
you're using.

Mohamad

On 11/13/2017 09:23 AM, 李逸超(基础平台部) wrote:
>
>
> "After I create a new ceph.client.conf" means a ceph.conf without
> `osd_tracing=true`.
>
> On November 13, 2017 at 10:20:55 PM, 李逸超(基础平台部)
> (liyichao@xxxxxxxxxxxxxxx) wrote:
>>
>> Hi, I tried two things: ceph with lttng and ceph with blkin. I will
>> describe the steps and problems I encountered.
>> ceph with lttng
>>
>> There is no doc in ceph, so I basically follow this blog. Problems are:
>>
>>     No tracepoints are listed by lttng list -u. Only after some
>> searching and looking at the source did I realize I have to put
>> osd_tracing etc. in the ceph.conf.
>>
>>     I have to run with
>> LD_PRELOAD=/usr/lib64/liblttng-ust-fork.so.0.0.0 ../vstart.sh …. Is
>> it possible to avoid this? Then we could start tracing production
>> OSDs more easily.
>>
>>     After I set osd_tracing = true and restart the cluster,
>> ./bin/ceph -c ceph.conf osd pool create test4 8 fails with Error EDQUOT:
>> crush test failed with -122. From some experience with the lttng
>> architecture and the source code of ./bin/ceph, my guess is that
>> ./bin/ceph links librados but may not do the dlopen because it is not
>> an OSD, so dynamic linkage with lttng does not work. After I create a
>> new ceph.client.conf and use ./bin/ceph -c ceph.client.conf, it
>> succeeds.
>>
>> ceph with blkin
>>
>> I follow Victor Araujo’s blog.
>>
>>     When I run LD_PRELOAD=/usr/lib64/liblttng-ust-fork.so.0.0.0
>> ../src/vstart.sh -d -k -x -b, it gets stuck at:
>>
>> …
>> /data/apps/ceph/build-blkin/bin/ceph-mgr -i x -c /data/apps/ceph/build-blkin/ceph.conf
>> /data/apps/ceph/build-blkin/bin/ceph -c /data/apps/ceph/build-blkin/ceph.conf -k /data/apps/ceph/build-blkin/keyring tell mgr restful create-self-signed-cert
>>
>> It is because mgr can not start:
>>    -1> 2017-11-13 22:18:35.647409 7f21df74e700 20 mgr Gil GIL acquired for thread state 0x7f21f8cae420
>>      0> 2017-11-13 22:18:35.762103 7f21df74e700 -1 *** Caught signal (Aborted) **
>>  in thread 7f21df74e700 thread_name:mgr-fin
>>
>>  ceph version 12.2.0 (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc)
>>  1: (()+0x3cf501) [0x7f21ee8c1501]
>>  2: (()+0xf100) [0x7f21eb9c9100]
>>  3: (gsignal()+0x37) [0x7f21ea9e65f7]
>>  4: (abort()+0x148) [0x7f21ea9e7ce8]
>>  5: (()+0x2729fb) [0x7f21c9bf69fb]
>>  6: (()+0xf3a3) [0x7f21ee2dd3a3]
>>  7: (()+0x13ab6) [0x7f21ee2e1ab6]
>>  8: (()+0xf1b4) [0x7f21ee2dd1b4]
>>  9: (()+0x131ab) [0x7f21ee2e11ab]
>>  10: (()+0x102b) [0x7f21ed4ba02b]
>>  11: (()+0xf1b4) [0x7f21ee2dd1b4]
>>  12: (()+0x162d) [0x7f21ed4ba62d]
>>  13: (dlopen()+0x31) [0x7f21ed4ba0c1]
>>
>> So I start it manually with /data/apps/ceph/build-blkin/bin/ceph-mgr
>> -i x -c /data/apps/ceph/build-blkin/ceph.conf, and the traces are
>> successfully shown.
>>
>>
>>
>>
>> On November 9, 2017 at 5:25:11 AM, Ali Maredia (amaredia@xxxxxxxxxx) wrote:
>>> Mohamad

On 11/08/2017 04:25 PM, Ali Maredia wrote:
> Hello,
>
> Could you give me more details about what you are trying
> to do? Maybe all the steps for it. Are you trying to trace
> just the RBD?
>
> Some of the work Victor did a couple of summers ago was going
> through the RBD code and putting tracepoints in it.
>
> I understand those 2 PRs were not merged, but I think related
> work was.
>
> I added Mohamad to this email who was working on tracing
> stuff recently.
>
> Best,
>
> Ali
>
> ----- Original Message -----
>> From: "李逸超(基础平台部)" <liyichao@xxxxxxxxxxxxxxx>
>> To: amaredia@xxxxxxxxxx
>> Sent: Wednesday, November 8, 2017 9:45:31 AM
>> Subject: There is no rbd_aio_write_traced function in librbd
>>
>> Recently I have been researching ceph tracing, and found this link:
>> http://victoraraujo.me/babeltrace-zipkin/ . Following the steps it
>> provides, I fail to build fio with rbd blkin. The error is: `undefined
>> reference to `rbd_aio_write_traced'`. I searched the ceph source code,
>> and there is indeed no such function:
>> https://github.com/ceph/ceph/search?q=rbd_aio_write_traced&type=Issues&utf8=%E2%9C%93
>> There are only two issues, which are not merged.
>>
>
