On Thu, May 28, 2015 at 4:50 PM, Deneau, Tom <tom.deneau@xxxxxxx> wrote:
>
>> -----Original Message-----
>> From: Gregory Farnum [mailto:greg@xxxxxxxxxxx]
>> Sent: Thursday, May 28, 2015 6:18 PM
>> To: Deneau, Tom
>> Cc: ceph-devel
>> Subject: Re: rados bench throughput with no disk or network activity
>>
>> On Thu, May 28, 2015 at 4:09 PM, Deneau, Tom <tom.deneau@xxxxxxx> wrote:
>> > I've noticed that
>> >   * with a single-node cluster with 4 osds
>> >   * and running rados bench rand on that same node, so no network traffic
>> >   * with a number of objects small enough that everything is in
>> >     the cache, so no disk traffic
>> >
>> > we still peak out at about 1600 MB/sec.
>> >
>> > And the cpu is 40% idle (and a good chunk of the cpu activity is the
>> > rados benchmark itself).
>> >
>> > What is likely causing the throttling here?
>>
>> Well, rados bench itself is essentially single-threaded, so if it's using
>> 100% CPU that's probably the bottleneck you're hitting.
>>
>> Otherwise, by default it will limit itself to 100MB of outstanding IO
>> (there's an objecter config value you can change for this; it's been
>> discussed recently) and that might not be enough given the latencies of
>> hopping packets across different CPUs, and the OSDs have a slightly-
>> embarrassing amount of CPU computation and thread hopping they have to
>> perform on every op (around half a millisecond's worth on each read, I
>> think?).
>> -Greg
>
> Right. I was involved in the objecter config discussion :) and
> I have set the limits higher. And this 1600 MB/sec limit seems to be
> the same whatever the size of the objects.
>
> rados bench is using about 30% of the cpu, and the total cpu usage is
> about 60% (the rest being mostly from the 4 osds).
>
> Hmm, I just tried running 4 copies of rados bench rand, and I can get
> slightly higher combined totals, but not much higher: maybe 1800 MB/sec.
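[For anyone finding this thread in the archives: the client-side objecter throttles Greg mentions can be raised per-invocation, since Ceph tools accept config options as command-line flags. A rough sketch of the "several parallel readers with raised limits" experiment Tom describes; the pool name and the specific values here are just examples, and the option names are as I remember them from this era of Ceph, so double-check against your release:]

```shell
# Raise the client-side objecter throttles from their defaults
# (objecter_inflight_op_bytes, default ~100 MB, and
#  objecter_inflight_ops, default 1024) and run four rand readers
# in parallel. "testpool" is an example pool name.
for i in 1 2 3 4; do
  rados bench -p testpool 60 rand \
      --objecter-inflight-op-bytes 1073741824 \
      --objecter-inflight-ops 8192 &
done
wait
```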
You might also just be approaching the limits of your hardware's effective
memory bandwidth in a configuration like that. 1600 MB/sec seems low to me,
but shuffling data back and forth across a bunch of sockets adds up. I don't
know if there's a good way to measure that.
-Greg
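[Archive note: one very crude way to put a lower bound on single-stream memory bandwidth is a large in-memory copy with dd; this assumes GNU dd, which prints achieved throughput on stderr. A proper measurement would use something like the STREAM benchmark instead, and neither captures the cross-socket traffic patterns Greg is describing:]

```shell
# Copy 16 GiB from /dev/zero to /dev/null through a 1 MiB userspace
# buffer. GNU dd reports bytes copied, elapsed time, and throughput
# on stderr. This is a single-core estimate only, not a STREAM run.
dd if=/dev/zero of=/dev/null bs=1M count=16384
```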