Re: jaeger tracing in the RGW

Thanks, Deepika! Will be there.

Below is a high-level work breakdown (from the RGW perspective):

(1) finalize: https://github.com/ceph/ceph/pull/42434 (without any additional scope)
(2) document the manual installation process as part of the user documentation
(3) multipart upload tracing: inject the parent span into the temporary head object, and extract it in any subsequent part put/complete (see the sketch after this list)
(4) consolidate the above PR with https://github.com/ceph/ceph/blob/master/src/common/tracer.h, and make the required changes to the OSD code
(5) librados: trickle the span down to librados, inject it as part of the protocol (similar to "blkin_trace_info"), and extract it on the OSD side
(6) use step (5) to do CLS/RGW end2end tracing
(7) cephadm deployment
(8) rook deployment
(9) multisite (with multi-collector) deployment documentation
(10) multisite tracing
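
For item (3), here is a minimal sketch of the inject side using the opentracing-cpp text-map propagation API (which the jaeger client implements). The carrier, the helper name, and the idea of persisting the map as attrs on the temporary head object are illustrative assumptions, not existing RGW code:

  #include <map>
  #include <string>
  #include <opentracing/propagation.h>
  #include <opentracing/tracer.h>

  // Text-map carrier that writes the span context into a plain string map,
  // which the caller could persist as attrs on the temporary head object.
  class AttrMapWriter : public opentracing::TextMapWriter {
  public:
    explicit AttrMapWriter(std::map<std::string, std::string>& attrs)
      : attrs(attrs) {}

    opentracing::expected<void> Set(
        opentracing::string_view key,
        opentracing::string_view value) const override {
      attrs[std::string(key.data(), key.size())] =
          std::string(value.data(), value.size());
      return {};
    }

  private:
    std::map<std::string, std::string>& attrs;
  };

  // Hypothetical init-multipart path: start the parent span and inject its
  // context into the attrs of the head object, so that later part uploads
  // and the complete can extract it and continue the same trace.
  void inject_multipart_trace(const std::string& upload_id,
                              std::map<std::string, std::string>& head_obj_attrs) {
    auto tracer = opentracing::Tracer::Global();
    auto span = tracer->StartSpan("multipart_upload");
    span->SetTag("upload_id", upload_id);
    tracer->Inject(span->context(), AttrMapWriter(head_obj_attrs));
    span->Finish();
  }

The extract side would mirror this with a TextMapReader over the same attrs (see the sketch in the inject/extract section further down in the thread).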


On Tue, Aug 3, 2021 at 11:39 PM Deepika Upadhyay <dupadhya@xxxxxxxxxx> wrote:
Hey Yuval,

These are great!

If anyone wants to have any discussion related to tracing, feel free to join the CDM: https://tracker.ceph.com/projects/ceph/wiki/CDM_04-AUG-2021
Time: 11am - 1pm (Eastern Time - New York)
Date: Wed Aug 4, 2021
Where: https://bluejeans.com/908675367

Thanks,
Deepika


On Wed, Jul 28, 2021 at 12:56 PM Yuval Lifshitz <ylifshit@xxxxxxxxxx> wrote:
Dear Community,
Below are discussion points regarding the addition of Jaeger tracing to the RGW [0].
Any feedback is welcome!

* multipart upload usecase:
- correlate the flow of multipart upload operations that may be spread across multiple RGWs (e.g. "put" of different parts on different RGWs)
- on each RGW, we would like to be able to follow the tracepoints of the operation from the frontend, via librados, down to the OSD
- we would like to be able to correlate the syncing of the object from the RGWs where the upload is done to RGWs in other zones
- this is probably the usecase that would require the most tracing features and would have the most value

* deployment in case of multisite:
- agents should run per host, co-located with the RGWs
- collectors can be per cluster
- if we have multiple clusters, we should probably follow the "Kafka as intermediate buffer" architecture from here [1], having multiple collectors send the spans/traces to a centralized location
- 1st deployment option would be manual, for which we would provide only the documentation
- 2nd deployment option would be in the case of k8s [2] and OpenShift [3]. more investigation is needed to figure out how to support a centralized DB location if different ceph clusters are in different k8s clusters
- 3rd option would be using cephadm. some work was started by the OSD team [4], but this probably won't cover the multisite case

* logs in traces:
- we should probably avoid unstructured string-based logs and should mainly use the trace/span names, together with tags, to convey the information in the trace
- e.g. use error codes as tags, instead of error messages in logs (sketched below)
- in the future, we may add structured or dictionary-based logs. note that a string copy would still be needed for the traces unless we modify the underlying Jaeger code
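
To illustrate, a hypothetical error path could record the outcome via tags rather than a free-form log on the span (the tag names here are made up, not an agreed convention):

  #include <opentracing/span.h>

  // Hypothetical op epilogue: convey the failure through tags instead of a
  // string log, so it can be filtered/aggregated without parsing messages.
  void finish_op_span(opentracing::Span& span, int ret) {
    // instead of something like:
    //   span.Log({{"event", "error"}, {"message", "failed to write head object"}});
    if (ret < 0) {
      span.SetTag("error", true);
      span.SetTag("error_code", ret);
    }
    span.Finish();
  }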

* inject/extract:
- for multisite sync tracing between RGWs, the trace should be added to the bucket-index log: injected when the object is created, and extracted by the RGW that does the sync
- adding that to the sync REST (HTTP) API is probably less useful
- would need to piggyback the trace onto the RADOS protocol, inject the span inside librados and extract it in the OSD for the rest of the tracing. this could be done similarly to the work done for blkin tracing [5]. need to check if we can use the same API, or need to add a new one
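
On the consuming side (the syncing RGW, or the OSD if the trace is piggybacked on the RADOS op), the flow could look roughly like the sketch below, mirroring the inject sketch earlier in the thread. The carrier and helper are illustrative assumptions, not existing Ceph code:

  #include <functional>
  #include <map>
  #include <memory>
  #include <string>
  #include <opentracing/propagation.h>
  #include <opentracing/tracer.h>

  // Text-map carrier reading key/value pairs recovered from the bucket-index
  // log entry (or from whatever blob gets piggybacked onto the RADOS op).
  class KvReader : public opentracing::TextMapReader {
  public:
    explicit KvReader(const std::map<std::string, std::string>& kv) : kv(kv) {}

    opentracing::expected<void> ForeachKey(
        std::function<opentracing::expected<void>(opentracing::string_view,
                                                  opentracing::string_view)> f)
        const override {
      for (const auto& p : kv) {
        auto result = f(p.first, p.second);
        if (!result) {
          return result;
        }
      }
      return {};
    }

  private:
    const std::map<std::string, std::string>& kv;
  };

  // Recover the remote span context and continue the trace under it; fall
  // back to a fresh root span if nothing (or something malformed) was found.
  std::unique_ptr<opentracing::Span>
  continue_trace(const std::map<std::string, std::string>& kv, const char* op) {
    auto tracer = opentracing::Tracer::Global();
    auto ctx = tracer->Extract(KvReader(kv));
    if (!ctx || !*ctx) {
      return tracer->StartSpan(op);
    }
    return tracer->StartSpan(op, {opentracing::ChildOf(ctx->get())});
  }

For the multisite sync case a FollowsFrom reference may fit better than ChildOf, since the sync happens well after the original write completes.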

* ops context:
- we should probably add host_id from req_state as a tag to identify the RGW that emitted the trace
- for multipart upload, we should add the "upload_id" as a tag so that traces that start on different RGWs could be correlated
- we should use return codes as tags, to indicate the success/failure reason of the operation. this could be done at the base level
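
A possible shape for this at the base level (the tag keys follow the bullets above; the helper itself is hypothetical, not existing RGW code):

  #include <memory>
  #include <string>
  #include <opentracing/tracer.h>

  // Hypothetical base-level helper: start the op span with the tags needed
  // to correlate it later; the return code gets tagged on completion.
  std::unique_ptr<opentracing::Span>
  start_op_span(const std::string& op_name,
                const std::string& host_id,    // taken from req_state
                const std::string& upload_id)  // empty if not a multipart op
  {
    auto span = opentracing::Tracer::Global()->StartSpan(op_name);
    span->SetTag("host_id", host_id);          // identifies the emitting RGW
    if (!upload_id.empty()) {
      span->SetTag("upload_id", upload_id);    // correlate parts across RGWs
    }
    return span;
  }
  // on completion: span->SetTag("return_code", ret); span->Finish();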

* locks and thread_local tracers:
- the goal here is to avoid contention on the locks used when a "Finish()" is called on a span - which sends the data to the agent
- AFAIK, in our threads/coroutine model, a different thread may resume a coroutine that started on another thread. this means that the span would use a different tracer to do the sending than the one that was used to create it. need to make sure that it works.
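
To make the concern concrete, here is a minimal sketch of the thread_local-tracer idea; MakeNoopTracer() stands in for whatever per-thread jaeger tracer construction we end up with:

  #include <memory>
  #include <opentracing/noop.h>
  #include <opentracing/tracer.h>

  // One tracer instance per thread, so reporting never contends on a lock
  // shared between threads.
  std::shared_ptr<opentracing::Tracer>& thread_tracer() {
    static thread_local std::shared_ptr<opentracing::Tracer> tracer =
        opentracing::MakeNoopTracer();  // stand-in for the real jaeger tracer
    return tracer;
  }

  // The open question from above: a coroutine may create a span via thread
  // A's tracer and be resumed on thread B, so the Finish() that reports the
  // span runs on a different thread than the one whose tracer created it.
  // Whether that hand-off is safe depends on the tracer/reporter
  // implementation and needs to be verified.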

* conditional tracing:
- this was brought up as an important usability issue for tracing, but was not discussed further. we should set up a separate discussion for that topic
- current code would allow dynamic enabling/disabling of tracing
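
A rough sketch of what such a runtime toggle could look like (the flag and helper are hypothetical, not the current code):

  #include <atomic>
  #include <memory>
  #include <opentracing/noop.h>
  #include <opentracing/tracer.h>

  // Flipped at runtime (e.g. from a config observer); when disabled, hand
  // out spans from a no-op tracer so op code can create/tag/finish spans
  // unconditionally.
  std::atomic<bool> tracing_enabled{false};

  std::unique_ptr<opentracing::Span> maybe_start_span(const char* name) {
    static const std::shared_ptr<opentracing::Tracer> noop =
        opentracing::MakeNoopTracer();
    if (!tracing_enabled.load(std::memory_order_relaxed)) {
      return noop->StartSpan(name);
    }
    return opentracing::Tracer::Global()->StartSpan(name);
  }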

Yuval



_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx