On Thu, Sep 10, 2015 at 1:02 PM, Deneau, Tom <tom.deneau@xxxxxxx> wrote:
> Running 9.0.3 rados bench on a 9.0.3 cluster...
> In the following experiments this cluster is only 2 osd nodes, 6 osds each,
> and a separate mon node (and a separate client running rados bench).
>
> I have two pools populated with 4M objects. The pools are replicated x2
> with identical parameters. The objects appear to be spread evenly across the 12 osds.
>
> In all cases I drop caches on all nodes before doing a rados bench seq test.
> In all cases I run rados bench seq for identical times (30 seconds), and in that time
> we do not run out of objects to read from the pool.
>
> I am seeing significant bandwidth differences between the following:
>
>   * running a single instance of rados bench reading from one pool with 32 threads
>     (bandwidth approx 300)
>
>   * running two instances of rados bench, each reading from one of the two pools
>     with 16 threads per instance (combined bandwidth approx. 450)
>
> I have already increased the following:
>     objecter_inflight_op_bytes = 104857600000
>     objecter_inflight_ops = 8192
>     ms_dispatch_throttle_bytes = 1048576000   # didn't seem to have any effect
>
> The disks and network are not reaching anywhere near 100% utilization.
>
> What is the best way to diagnose what is throttling things in the one-instance case?

Pretty sure the rados bench main threads are just running into their limits. There's
some work that Piotr (I think?) has been doing to make it more efficient if you want
to browse the PRs, but I don't think they're even in a dev release yet.
-Greg
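
For anyone wanting to reproduce the comparison, here is a minimal sketch of the two
cases described above. Pool names, node names, and the write-phase population step
are placeholders/assumptions on my part, not details from Tom's setup:

    #!/bin/sh
    # Sketch only -- pool names (bench1, bench2) and node names are placeholders.
    # Assumes each pool was populated earlier with something like:
    #   rados bench -p bench1 600 write --no-cleanup
    # so that the "seq" phase has objects to read back.

    # Drop page caches on every node before each timed run.
    for node in osd-node1 osd-node2 mon-node client-node; do
        ssh "$node" 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null'
    done

    # Case 1: one rados bench instance, 32 concurrent ops, one pool.
    rados bench -p bench1 30 seq -t 32

    # (drop caches again with the loop above before the second case)

    # Case 2: two instances in parallel, 16 concurrent ops each, one pool per instance.
    rados bench -p bench1 30 seq -t 16 &
    rados bench -p bench2 30 seq -t 16 &
    wait

Pointing the two instances at separate pools, as in the original test, keeps the two
clients from reading back the same benchmark objects.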