60-80MB/s for what sort of setup? Is that 1GbE rather than 10GbE?

I consistently get 80-90MB/s bandwidth as measured by `rados bench -p
rbd 10 write` run from a ceph node on a cluster with:

* 3 nodes
* 4 OSDs/node, 600GB 15kRPM SAS disks
* 1GB disk controller write cache, shared by all disks in each node
* No SSDs
* 2x1GbE LACP bond for redundancy, no jumbo frames
* 512 PGs for a cluster of 12 OSDs
* All disks in one pool of size=3, min_size=2

IOzone run on a VM using an rbd as its HD confirms that setup maxes
out at just under 100 MB/s in the best case, so I assumed the 1Gb
network was the bottleneck.

I'm in the process of planning a hardware purchase for a larger
cluster: more nodes, more drives, SSD journals and 10GbE. I'm assuming
I'll get better performance.

What's the upper bound on Ceph performance for large sequential writes
from a single client with all the recommended bells and whistles (SSD
journal, 10GbE)? I assume it depends on the total number of OSDs, and
possibly on OSDs per node if one node had enough disks to saturate its
network link. Correct?
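
For what it's worth, here's the back-of-the-envelope version of that
assumption (the 70MB/s per disk is just the midpoint of Mark's
filestore estimate below, and the ~1100MB/s of usable 10GbE payload is
my guess, not a measurement):

  # Rough single-client ceiling for large sequential writes: each
  # write lands on SIZE replicas, so aggregate disk bandwidth is
  # divided by the replication factor, then capped by the client NIC.
  OSDS=12; PER_DISK=70; SIZE=3; NET=1100
  echo "backend limit: $((OSDS * PER_DISK / SIZE)) MB/s"
  echo "network limit: ${NET} MB/s"

If those numbers hold, my current 12 OSDs would top out around 280MB/s,
well under 10GbE, and a single client wouldn't saturate the network
until somewhere around 45-50 OSDs.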
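
A way to sanity-check that from a single client, without a VM in the
middle, would be fio's rbd engine, something like this (the image name
"test" is only an example, and it assumes fio was built with rbd
support):

  # 4MB sequential writes through librbd from one client; --size must
  # not exceed the size of the test image.
  fio --name=seqwrite --ioengine=rbd --clientname=admin --pool=rbd \
      --rbdname=test --rw=write --bs=4M --iodepth=16 --size=10G

That should be directly comparable to the rados bench numbers below,
since rados bench also does 4MB object writes.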
--
Adam Carheden

On 04/06/2017 12:29 PM, Mark Nelson wrote:
> With filestore on XFS using SSD journals that have good O_DSYNC write
> performance, we typically see between 60-80MB/s per disk before
> replication for large object writes. This is assuming there are no
> other bottlenecks or things going on though (pg splitting, recovery,
> network issues, etc). Probably the best-case scenario would be large
> writes to an RBD volume with 4MB objects and enough PGs in the pool
> that splits never need to happen.
>
> Having said that, on setups where some of the drives are slow, the
> network is misconfigured, there are too few PGs, there are too many
> drives on one controller, or other issues, 25-30MB/s per disk is
> certainly possible.
>
> Mark
>
> On 04/06/2017 10:05 AM, Stanislav Kopp wrote:
>> I've reduced the OSDs to 12 and moved the journals to SSD drives,
>> and now writes have a "boost" to ~33-35MB/s. Is that the maximum
>> without full-SSD pools?
>>
>> Best,
>> Stan
>>
>> 2017-04-06 9:34 GMT+02:00 Stanislav Kopp <staskopp@xxxxxxxxx>:
>>> Hello,
>>>
>>> I'm evaluating a ceph cluster to see if we can use it for our
>>> virtualization solution (proxmox). I'm using 3 nodes running Ubuntu
>>> 16.04 with stock ceph (10.2.6); every OSD uses a separate 8 TB
>>> spinning drive (XFS), the MONs are installed on the same nodes, and
>>> all nodes are connected via a 10G switch.
>>>
>>> The problem is that on the client I get only ~25-30 MB/s with
>>> sequential writes (dd with "oflag=direct"). Proxmox uses Firefly,
>>> which is old, I know, but I get the same performance on my desktop,
>>> which runs the same ceph version as the nodes, using an rbd mount,
>>> and iperf shows full speed (1GbE or 10GbE up to the client).
>>> I know this setup is not optimal, and for production I will use
>>> separate MON nodes and SSDs for the OSDs, but I was wondering
>>> whether this performance is still normal. This is my cluster status:
>>>
>>>     cluster 3ea55c7e-5829-46d0-b83a-92c6798bde55
>>>      health HEALTH_OK
>>>      monmap e5: 3 mons at
>>> {ceph01=10.1.8.31:6789/0,ceph02=10.1.8.32:6789/0,ceph03=10.1.8.33:6789/0}
>>>             election epoch 60, quorum 0,1,2 ceph01,ceph02,ceph03
>>>      osdmap e570: 42 osds: 42 up, 42 in
>>>             flags sortbitwise,require_jewel_osds
>>>       pgmap v14784: 1024 pgs, 1 pools, 23964 MB data, 6047 objects
>>>             74743 MB used, 305 TB / 305 TB avail
>>>                 1024 active+clean
>>>
>>> btw, the bench on the nodes themselves looks good as far as I can
>>> see:
>>>
>>> ceph01:~# rados bench -p rbd 10 write
>>> ....
>>> Total time run:         10.159667
>>> Total writes made:      1018
>>> Write size:             4194304
>>> Object size:            4194304
>>> Bandwidth (MB/sec):     400.801
>>> Stddev Bandwidth:       38.2018
>>> Max bandwidth (MB/sec): 472
>>> Min bandwidth (MB/sec): 344
>>> Average IOPS:           100
>>> Stddev IOPS:            9
>>> Max IOPS:               118
>>> Min IOPS:               86
>>> Average Latency(s):     0.159395
>>> Stddev Latency(s):      0.110994
>>> Max latency(s):         1.1069
>>> Min latency(s):         0.0432668
>>>
>>> Thanks,
>>> Stan

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com