Mark,

On what version did you run the tests?

Orit

On Mon, Feb 6, 2017 at 7:07 PM, Mark Nelson <mnelson@xxxxxxxxxx> wrote:
>
>
> On 02/06/2017 11:02 AM, Orit Wasserman wrote:
>>
>> On Mon, Feb 6, 2017 at 5:44 PM, Matt Benjamin <mbenjamin@xxxxxxxxxx>
>> wrote:
>>>
>>> Keep in mind, RGW does most of its request processing work in civetweb
>>> threads, so high utilization there does not necessarily imply
>>> civetweb-internal processing.
>>>
>>
>> True, but request processing is not a CPU-intensive operation.
>> It does seem to indicate that the civetweb threading model simply
>> doesn't scale (we noticed this already), or it may point to a locking
>> issue. We need to run a profiler to understand what is consuming CPU.
>> It may be a simple fix until we move to the asynchronous frontend.
>> It's worth investigating, as the CPU usage Mark is seeing is really
>> high.
>
>
> The initial profiling I did definitely showed a lot of tcmalloc
> threading activity, which diminished after increasing the thread cache.
> This is quite similar to what we saw in simplemessenger with low
> threadcache values, though it is likely less true with the async
> messenger. Sadly, a profiler like perf probably isn't going to help
> much with debugging lock contention. Grabbing GDB stack traces might
> help, or lttng.
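(A minimal sketch, not from the thread, of the thread-cache adjustment Mark
describes: tcmalloc's aggregate thread-cache ceiling can be raised through
the standard gperftools environment variable before radosgw starts. The
daemon name/flags and the 128MB value below are illustrative assumptions.)

    import os
    import subprocess

    # Raise tcmalloc's total thread-cache ceiling (gperftools default is
    # 32MB) before launching radosgw -- the same knob commonly set for
    # OSDs via /etc/sysconfig/ceph. Daemon name and flags illustrative.
    env = dict(os.environ)
    env["TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES"] = str(128 * 1024 * 1024)
    subprocess.Popen(["radosgw", "-f", "-n", "client.rgw.gateway"], env=env)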
>
>>
>> Mark,
>> How many concurrent requests were handled?
>
>
> Most of the tests had 128 concurrent IOs per radosgw daemon. The max
> thread count was increased to 512. It was very obvious when exceeding
> the thread count, since some getput processes would end up stalling and
> doing their writes after the others, leading to bogus performance data.
>
>
>>
>> Orit
>>
>>> Matt
>>>
>>> ----- Original Message -----
>>>>
>>>> From: "Mark Nelson" <mnelson@xxxxxxxxxx>
>>>> To: "Matt Benjamin" <mbenjamin@xxxxxxxxxx>
>>>> Cc: "ceph-devel" <ceph-devel@xxxxxxxxxxxxxxx>, cbt@xxxxxxxxxxxxxx,
>>>> "Mark Seger" <mjseger@xxxxxxxxx>, "Kyle Bader" <kbader@xxxxxxxxxx>,
>>>> "Karan Singh" <karan@xxxxxxxxxx>, "Brent Compton" <bcompton@xxxxxxxxxx>
>>>> Sent: Monday, February 6, 2017 10:42:04 AM
>>>> Subject: Re: CBT: New RGW getput benchmark and testing diary
>>>>
>>>> Just based on what I saw during these tests, it looks to me like a
>>>> lot more time was spent dealing with civetweb's threads than with RGW
>>>> itself. I didn't look too closely, but it may be worth looking at
>>>> whether there's any low-hanging fruit in civetweb itself.
>>>>
>>>> Mark
>>>>
>>>> On 02/06/2017 09:33 AM, Matt Benjamin wrote:
>>>>>
>>>>> Thanks for the detailed effort and analysis, Mark.
>>>>>
>>>>> As we get closer to the L time-frame, it should become relevant to
>>>>> look at the related boost::asio frontend rework I/O paths, which are
>>>>> the open effort to reduce CPU overhead and revise the threading
>>>>> model in general.
>>>>>
>>>>> Matt
>>>>>
>>>>> ----- Original Message -----
>>>>>>
>>>>>> From: "Mark Nelson" <mnelson@xxxxxxxxxx>
>>>>>> To: "ceph-devel" <ceph-devel@xxxxxxxxxxxxxxx>, cbt@xxxxxxxxxxxxxx
>>>>>> Cc: "Mark Seger" <mjseger@xxxxxxxxx>, "Kyle Bader" <kbader@xxxxxxxxxx>,
>>>>>> "Karan Singh" <karan@xxxxxxxxxx>, "Brent Compton" <bcompton@xxxxxxxxxx>
>>>>>> Sent: Monday, February 6, 2017 12:55:20 AM
>>>>>> Subject: CBT: New RGW getput benchmark and testing diary
>>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> Over the weekend I took a stab at improving our ability to run RGW
>>>>>> performance tests in CBT. Previously the only way to do this was to
>>>>>> use the cosbench plugin, which required a fair amount of additional
>>>>>> setup and, while quite powerful, can be overkill in situations
>>>>>> where you want to rapidly iterate over tests looking for specific
>>>>>> issues. A while ago Mark Seger from HP told me he had created a
>>>>>> swift benchmark called "getput" that is written in python and is
>>>>>> much more convenient to run quickly in an automated fashion.
>>>>>> Normally getput is used in conjunction with gpsuite, a tool for
>>>>>> coordinating benchmarking across multiple getput processes. This is
>>>>>> how you would likely use getput on a typical ceph or swift cluster,
>>>>>> but since CBT builds the cluster and has its own way of launching
>>>>>> multiple benchmark processes, it uses getput directly.
>>>>>>
>>>
>>> --
>>> Matt Benjamin
>>> Red Hat, Inc.
>>> 315 West Huron Street, Suite 140A
>>> Ann Arbor, Michigan 48103
>>>
>>> http://www.redhat.com/en/technologies/storage
>>>
>>> tel. 734-821-5101
>>> fax. 734-769-8938
>>> cel. 734-216-5309
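For anyone wanting to reproduce the direct-launch setup described above,
here is a rough sketch (not from the thread) of fanning out getput
processes the way CBT does, bypassing gpsuite. It assumes swift-style
credentials are already exported, and the flag spellings are illustrative
assumptions -- verify them against "getput -h" for your version.

    import subprocess

    # One getput process per simulated client, no gpsuite. Assumes
    # swift-style auth (e.g. ST_AUTH/ST_USER/ST_KEY) is already in the
    # environment; flag spellings below are illustrative.
    workers = []
    for i in range(4):                     # 4 concurrent workers
        workers.append(subprocess.Popen([
            "getput",
            "-c", "cont%d" % i,            # separate container per worker
            "-o", "obj",                   # object name prefix
            "-s", "4k",                    # object size
            "-t", "p",                     # test type: p=put, g=get, d=delete
            "-n", "1000",                  # objects per worker
        ]))
    for w in workers:
        w.wait()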