After running some other experiments, I see now that the high single-node
bandwidth only occurs when ceph-mon is also running on that same node.
(In these small clusters I only had one ceph-mon running.)  If I compare
to a single node where ceph-mon is not running, I see basically identical
performance to the two-node arrangement.

So now my question is: is it expected that there would be such a large
performance difference between using OSDs on a single node where ceph-mon
is running vs. using OSDs on a single node where ceph-mon is not running?

-- Tom

> -----Original Message-----
> From: Deneau, Tom
> Sent: Thursday, September 03, 2015 10:39 AM
> To: 'Christian Balzer'; ceph-users
> Subject: RE: osds on 2 nodes vs. on one node
>
> Rewording to remove confusion...
>
> Config 1: set up a cluster with 1 node with 6 OSDs
> Config 2: identical hardware, set up a cluster with 2 nodes with 3 OSDs each
>
> In each case I do the following:
>   1) rados bench write --no-cleanup the same number of 4M-size objects
>   2) drop caches on all OSD nodes
>   3) rados bench seq -t 4 to sequentially read the objects
>      and record the read bandwidth
>
> Rados bench is running on a separate client, not on an OSD node.
> The client has plenty of spare CPU power, and the network and disk
> utilization are not limiting factors.
>
> With Config 1, I see approximately 70% more sequential read bandwidth
> than with Config 2.
>
> In both cases the primary OSDs of the objects appear evenly distributed
> across the OSDs.
>
> Yes, the replication factor is 2, but since we are only measuring read
> performance, I don't think that matters.
>
> The question is whether there is a ceph parameter that might be
> throttling the 2-node configuration.
>
> -- Tom
>
> > -----Original Message-----
> > From: Christian Balzer [mailto:chibi@xxxxxxx]
> > Sent: Wednesday, September 02, 2015 7:29 PM
> > To: ceph-users
> > Cc: Deneau, Tom
> > Subject: Re: osds on 2 nodes vs. on one node
> >
> >
> > Hello,
> >
> > On Wed, 2 Sep 2015 22:38:12 +0000 Deneau, Tom wrote:
> >
> > > In a small cluster I have 2 OSD nodes with identical hardware, each
> > > with 6 osds.
> > >
> > > * Configuration 1: I shut down the osds on one node so I am using 6
> > >   OSDs on a single node
> > >
> > Shut down how?
> > Just a "service blah stop", or actually removing them from the cluster,
> > aka the CRUSH map?
> >
> > > * Configuration 2: I shut down 3 osds on each node so now I have 6
> > >   total OSDs but 3 on each node.
> > >
> > Same as above.
> > And in this case even more relevant, because just shutting down random
> > OSDs on both nodes would result in massive recovery action at best and
> > more likely a broken cluster.
> >
> > > I measure read performance using rados bench from a separate client node.
> > Default parameters?
> >
> > > The client has plenty of spare CPU power and the network and disk
> > > utilization are not limiting factors. In all cases, the pool type is
> > > replicated so we're just reading from the primary.
> > >
> > Replicated as in size 2?
> > We can guess/assume that from your cluster size, but without you telling
> > us or giving us the various config/CRUSH outputs, that is only a guess.
> >
> > > With Configuration 1, I see approximately 70% more bandwidth than
> > > with Configuration 2.
> >
> > Never mind that bandwidth is mostly irrelevant in real life: which
> > bandwidth, read or write?
> >
> > > In general, any configuration where the osds span 2 nodes gets
> > > poorer performance, but in particular when the 2 nodes have equal
> > > amounts of traffic.
> > >
> > Again, guessing from what you're actually doing, this isn't particularly
> > surprising.
> > Because with a single node, default rules and replication of 2, your
> > OSDs never have to replicate anything when it comes to writes.
> > Whereas with 2 nodes replication happens and takes more time (latency)
> > and might also saturate your network (we of course have no idea what
> > your cluster looks like).
> >
> > Christian
> >
> > > Is there any ceph parameter that might be throttling the cases where
> > > osds span 2 nodes?
> > >
> > > -- Tom Deneau, AMD
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > >
> >
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx           Global OnLine Japan/Fusion Communications
> > http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
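
For anyone trying to reproduce the measurement, here is a minimal sketch of
the benchmark sequence Tom describes above. The pool name "testpool", the
host names osd-node1/osd-node2, the 60-second run length, and root SSH access
are all placeholder assumptions for illustration; the thread does not give
the actual values, so adjust them to match the test.

    # 1) write objects from the client and keep them (4M is the rados
    #    bench default object size); pool/host names are placeholders
    rados bench -p testpool 60 write --no-cleanup

    # 2) drop the page cache on every OSD node so the reads come from disk
    ssh root@osd-node1 'sync; echo 3 > /proc/sys/vm/drop_caches'
    ssh root@osd-node2 'sync; echo 3 > /proc/sys/vm/drop_caches'

    # 3) sequentially read the objects back with 4 concurrent ops and
    #    record the reported bandwidth
    rados bench -p testpool 60 seq -t 4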