Can anyone give me some advice?
---------- Forwarded message ----------
From: <xizhiyong18@xxxxxxxxx>
Date: 2016-04-26 18:50 GMT+08:00
Subject: google perftools on ceph-osd
To: Stefan Priebe - Profihost AG <s.priebe@xxxxxxxxxxxx>
Hi Stefan,
While using Ceph, I found that the OSD process uses a lot of CPU, especially under small random writes. I wanted to do some analysis to find the slow point or bottleneck. I first used perf (record/report) and found that memory allocation and free in tcmalloc account for much of the CPU time, but perf gave me no further detail about ceph-osd itself.
I did some searching and found the thread 'extreme ceph-osd cpu load for rand. 4k write' (http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/10315), which you reported some years ago. In those mails you attached some files generated by google-perftools, which gave some detail about ceph-osd.
Below is what I did.
First I added some code to ceph_osd.cc to register a signal handler (code below). Then I started an I/O benchmark on Ceph and waited for the CPU usage of ceph-osd to get high. I sent SIGUSR1 to the process (kill -s SIGUSR1 pid) and later sent SIGUSR2 (kill -s SIGUSR2 pid). That produced the profile file, but when I analyzed it with the pprof tool there was no profile data (Total: 0 samples).
Could you share more detail on how you used google-perftools with ceph-osd, which is a daemon process?
-------------------------------code---------------------------------
diff --git a/src/ceph_osd.cc b/src/ceph_osd.cc
index a2f4542..1ed654e 100644
--- a/src/ceph_osd.cc
+++ b/src/ceph_osd.cc
@@ -12,6 +12,7 @@
*
*/
+#include <google/profiler.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
@@ -83,6 +84,20 @@ int preload_erasure_code()
return r;
}
+void gprof_callback(int signum)
+{
+    if (signum == SIGUSR1)
+    {
+        dout(0) << "Catch the signal ProfilerStart\n" << dendl;
+        ProfilerStart("/tmp/bs.prof");
+    }
+    else if (signum == SIGUSR2)
+    {
+        dout(0) << "Catch the signal ProfilerStop\n" << dendl;
+        ProfilerStop();
+    }
+}
+
int main(int argc, const char **argv)
{
vector<const char*> args;
@@ -511,6 +526,11 @@ int main(int argc, const char **argv)
// install signal handlers
init_async_signal_handler();
register_async_signal_handler(SIGHUP, sighup_handler);
+
+ register_async_signal_handler(SIGUSR1, gprof_callback);
+
+ register_async_signal_handler(SIGUSR2, gprof_callback);
+
register_async_signal_handler_oneshot(SIGINT, handle_osd_signal);
register_async_signal_handler_oneshot(SIGTERM, handle_osd_signal);
--------------------------------------------------
----------------some log--------------------
2016-04-26 17:49:28.836289 7fd653475700 0 Catch the signal ProfilerStart
2016-04-26 17:52:44.877696 7fd653475700 0 Catch the signal ProfilerStop
----------------------------------------------------------------------------------------------------------------
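For anyone who wants to try the same start/stop logic outside of ceph-osd, here is a minimal standalone sketch of it. It does not link against google-perftools: the FakeProfiler struct and the profile_signal() and sigprof_blocked() functions are hypothetical names made up for illustration, and the places where the real ProfilerStart()/ProfilerStop() calls would go are marked in comments. The google-perftools CPU profiler takes its samples through SIGPROF timer signals by default, so the sketch also includes a small check of whether SIGPROF is blocked in the calling thread. Whether that has anything to do with my empty profile is only a guess, not something I have confirmed.

```cpp
#include <cassert>
#include <signal.h>
#include <string>

// Hypothetical stand-in for the profiler state. In ceph_osd.cc the real
// calls would be ProfilerStart()/ProfilerStop() from <google/profiler.h>.
struct FakeProfiler {
    bool running = false;
    std::string path;
    int starts = 0;
    int stops = 0;
};

static FakeProfiler g_prof;

// Mirrors gprof_callback() from the diff: SIGUSR1 starts, SIGUSR2 stops.
// A repeated start or a stop without a start is ignored.
void profile_signal(int signum)
{
    if (signum == SIGUSR1 && !g_prof.running) {
        g_prof.running = true;          // real code: ProfilerStart(path)
        g_prof.path = "/tmp/bs.prof";
        ++g_prof.starts;
    } else if (signum == SIGUSR2 && g_prof.running) {
        g_prof.running = false;         // real code: ProfilerStop()
        ++g_prof.stops;
    }
}

// Returns true when SIGPROF is blocked in the calling thread.
// Passing a null new mask only queries the current mask.
bool sigprof_blocked()
{
    sigset_t current;
    sigemptyset(&current);
    pthread_sigmask(SIG_SETMASK, nullptr, &current);
    return sigismember(&current, SIGPROF) == 1;
}
```

Calling sigprof_blocked() from inside a worker thread of the daemon (and logging the result) would show whether that thread can receive the sampling signal at all.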
Regards,
Zhiyong Xi
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com