On 2018-06-29 18:30, Matthew Stroud wrote:
We back some of our Ceph clusters with SAN SSD disk, particularly VSP G/F and Pure Storage. I'm curious about which settings we should look into modifying to take advantage of our SAN arrays. Manually setting the device class for the LUNs to SSD was a big improvement. However, we still see situations where we get slow requests while the underlying disks and network are underutilized.
More info about our setup: we are running CentOS 7 with Luminous as our Ceph release. We have 4 OSD nodes with 5x2TB disks each, set up as BlueStore. Our ceph.conf is attached, with some information removed for security reasons.
Thanks ahead of time.
Thanks,
Matthew Stroud
If I understand correctly, you are using LUNs (via iSCSI) from your external SAN as OSDs, created a separate pool from these OSDs with device class SSD, and are using this pool for backups.
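For reference, the manual device-class override described above is typically done like this on Luminous. The OSD IDs, rule name, and pool name below are placeholders, not taken from the attached conf:

```shell
# Clear the auto-detected class, then pin the class to ssd
# (osd.0 is a placeholder; repeat for each SAN-backed OSD)
ceph osd crush rm-device-class osd.0
ceph osd crush set-device-class ssd osd.0

# Create a replicated CRUSH rule restricted to the ssd device class,
# then point the backup pool at it (names here are examples)
ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd pool set backup-pool crush_rule ssd-rule
```

Verify the result with `ceph osd tree` (the CLASS column) and `ceph osd pool get backup-pool crush_rule`.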
Some comments:
- Using external SAN disks as OSDs is probably not that common. It may be better to keep the SAN and the Ceph cluster separate and have your backup tool access both; it is also safer in case of a disaster to the cluster, since your backups will be on a separate system.
- What backup tool/script are you using? It is better if this tool uses a high queue depth, large block sizes, and memory/page cache to increase performance during copies.
- To pin down where your current bottleneck is, I would run benchmarks (e.g. fio) using the block sizes used by your backup tool, both on the raw LUNs before they are added as OSDs (as pure iSCSI disks) and on the main and backup pools. Have a resource monitoring tool (e.g. atop/sysstat/collectl) running during these tests to check resource usage: disk %busy, CPU core %busy, and I/O wait.
- You can probably use a replica count of 1 for the SAN OSDs, since the arrays provide their own RAID redundancy.
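A sketch of the raw-LUN fio run suggested above (the device path, block size, and runtime are assumptions; match --bs to your backup tool's block size):

```shell
# Destructive write test against a raw iSCSI LUN -- run this only
# BEFORE the LUN is added as an OSD (/dev/sdX is a placeholder)
fio --name=lun-test --filename=/dev/sdX --ioengine=libaio \
    --direct=1 --rw=randwrite --bs=64k --iodepth=32 --numjobs=4 \
    --runtime=60 --time_based --group_reporting

# In another terminal, watch disk/CPU utilisation while it runs
iostat -xz 1
```

For the pool-level comparison, `rados bench` against the main and backup pools gives comparable numbers. And if you do drop the SAN pool to a single replica, something like `ceph osd pool set backup-pool size 1` (pool name is an example) would do it, with the caveat that Ceph then has no redundancy of its own for that pool.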
Maged
Attachment:
ceph.conf
Description: Binary data
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com