On 08/06/2018 06:24 PM, John Spray wrote:
> On Mon, Aug 6, 2018 at 5:04 PM Wido den Hollander <wido@xxxxxxxx> wrote:
>>
>> Hi,
>>
>> I'm working with a customer on speeding up the Influx and Telegraf
>> modules, which gather statistics from their cluster of 2,000 OSDs.
>>
>> The problem I'm running into is the performance of the Influx module,
>> but this seems to boil down to the Mgr daemon.
>>
>> Gathering and sending all statistics of the cluster takes about 35
>> seconds with the current code of the Influx module.
>>
>> By using iterators, queues and multi-threading I was able to bring this
>> down to ~20 seconds, but the main problem is this piece of code:
>>
>> for daemon, counters in six.iteritems(self.get_all_perf_counters()):
>>     svc_type, svc_id = daemon.split(".", 1)
>>     metadata = self.get_metadata(svc_type, svc_id)
>>
>>     for path, counter_info in counters.items():
>>         if counter_info['type'] & self.PERFCOUNTER_HISTOGRAM:
>>             continue
>>
>> Gathering all the performance counters and metadata of these 2,000
>> daemons brings the grand total to about 95k data points.
>>
>> Influx flushes these within just a few seconds, but it takes the Mgr
>> daemon a lot more time to spit them out.
>>
>> I also see that the ceph-mgr daemon starts to use a lot of CPU when
>> going through this.
>>
>> The Telegraf module also suffers from this, as it uses the same code
>> path to fetch these counters.
>>
>> Is there anything we can do better inside the modules? Or something to
>> be improved inside the Mgr?
>
> There's definitely room to make get_all_perf_counters *much* more
> efficient. It's currently issuing individual get_counter() calls into
> C++ land for every counter, and get_counter is returning the last N
> values into Python before get_latest throws away all but the latest.

Ah, yes. It seemed like that when going through the Python code.

> I'd suggest implementing a C++ version of get_all_perf_counters.
> There will always be some ceiling on how much data is practical in the
> "one big endpoint" approach to gathering stats, but if we have a
> potential order of magnitude improvement in this call then we should
> do it.

Yes, I'm aware. I would like to stay away from polling the daemons
locally on each host.

There will always be an upper limit, that's normal, but I do think the
Mgr should be able to handle a couple of thousand OSDs.

Wido

> John
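
As a reference for the approach Wido describes above, here is a minimal,
self-contained Python 3 sketch of the producer/consumer pattern: lazily
iterate over the perf counter data, push points onto a bounded queue, and
let worker threads flush batches to the backend. fetch_counters() and
flush_batch() are hypothetical stand-ins for the module's real calls, and
BATCH_SIZE is an arbitrary illustration, not the actual Influx module code:

import queue
import threading

BATCH_SIZE = 5000          # arbitrary; tune to the backend's flush size
SENTINEL = object()        # tells a worker thread to stop

def fetch_counters():
    # Hypothetical generator yielding (daemon, path, value) tuples;
    # stands in for walking get_all_perf_counters() output.
    sample = {"osd.0": {"osd.op_r": 10}, "osd.1": {"osd.op_w": 7}}
    for daemon, counters in sample.items():
        for path, value in counters.items():
            yield (daemon, path, value)

def flush_batch(batch):
    # Hypothetical stand-in for one bulk write to Influx/Telegraf.
    print("flushing %d points" % len(batch))

def worker(q):
    batch = []
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        batch.append(item)
        if len(batch) >= BATCH_SIZE:
            flush_batch(batch)
            batch = []
    if batch:
        flush_batch(batch)

def gather(num_workers=4):
    q = queue.Queue(maxsize=10000)   # bounded: the producer backpressures
    threads = [threading.Thread(target=worker, args=(q,))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for point in fetch_counters():   # iterate lazily; never hold all
        q.put(point)                 # ~95k points in memory at once
    for _ in threads:
        q.put(SENTINEL)              # one stop marker per worker
    for t in threads:
        t.join()

if __name__ == "__main__":
    gather()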
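
And a rough illustration of the per-counter cost John points at: each
get_counter() call crosses from Python into C++ and copies a counter's
recent history back, only for get_latest() to keep the newest sample.
The bodies below only approximate the shape of those calls and are not
the real mgr code:

def get_counter(svc_type, svc_id, path):
    # Crosses the Python/C++ boundary once per counter and returns the
    # last N (timestamp, value) samples -- most of which are discarded.
    history = [(1533571200, 40), (1533571205, 41), (1533571210, 42)]
    return {path: history}

def get_latest(svc_type, svc_id, path):
    # Keeps only the newest sample; the rest was copied for nothing.
    data = get_counter(svc_type, svc_id, path)[path]
    return data[-1][1] if data else 0

# With ~95k counters across 2,000 daemons, that is ~95k boundary
# crossings per poll. A C++-side get_all_perf_counters returning one
# dict of latest values would collapse this to a single crossing.
if __name__ == "__main__":
    print(get_latest("osd", "0", "osd.op_r"))  # -> 42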