Hello Chris,
We don't use SSDs as journals.

(1) osd_mkfs_type = xfs
osd_mkfs_options_xfs = -f
filestore_xattr_use_omap = true
(2) filestore_queue_max_ops = 25000
filestore_queue_max_bytes = 10485760
filestore_queue_committing_max_ops = 5000
filestore_queue_committing_max_bytes = 10485760000
journal_max_write_bytes = 1073714824
journal_max_write_entries = 10000
journal_queue_max_ops = 50000
journal_queue_max_bytes = 10485760000
(3) osd_mount_options_xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier"
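On point (3), a possibly safer variant of that mount line is sketched below. This assumes you have not verified that every cache in the write path is battery- or flash-backed (see Christian's warning further down); also, nodiratime is implied by noatime on Linux, so it is redundant:

```ini
[osd]
# Hypothetical safer variant: drop nobarrier unless every on-disk and
# controller cache is known to be battery/flash backed. nodiratime is
# already implied by noatime, so it can be dropped as well.
osd_mount_options_xfs = "rw,noexec,nodev,noatime"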
2016-05-10 15:31 GMT+08:00 Christian Balzer <chibi@xxxxxxx>:
On Tue, 10 May 2016 11:48:07 +0800 Geocast wrote:
Hello,
> We have 21 hosts for ceph OSD servers, each host has 12 SATA disks (4TB
> each), 64GB memory.
No journal SSDs?
What CPU(s) and network?
> ceph version 10.2.0, Ubuntu 16.04 LTS
> The whole cluster is new installed.
>
> Can you help check what the arguments we put in ceph.conf is reasonable
> or not?
> thanks.
>
> [osd]
> osd_data = /var/lib/ceph/osd/ceph-$id
> osd_journal_size = 20000
Overkill most likely, but not an issue.
> osd_mkfs_type = xfs
> osd_mkfs_options_xfs = -f
> filestore_xattr_use_omap = true
> filestore_min_sync_interval = 10
Are you aware of what this does, and have you actually tested it (IOPS AND
throughput) against various other settings on your hardware to arrive at this
number?
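For comparison, the filestore defaults in this area are orders of magnitude smaller; the values below are as I recall them from the jewel-era filestore config reference, so verify them on your own build:

```ini
# Jewel-era defaults for comparison (verify on your build with
# "ceph daemon osd.0 config show"):
filestore_min_sync_interval = 0.01   # seconds; the posted value of 10 is 1000x this
filestore_max_sync_interval = 5      # seconds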
> filestore_max_sync_interval = 15
That's fine in and of itself; it is unlikely to ever be reached anyway.
> filestore_queue_max_ops = 25000
> filestore_queue_max_bytes = 10485760
> filestore_queue_committing_max_ops = 5000
> filestore_queue_committing_max_bytes = 10485760000
> journal_max_write_bytes = 1073714824
> journal_max_write_entries = 10000
> journal_queue_max_ops = 50000
> journal_queue_max_bytes = 10485760000
Same as above: have you tested these settings (from filestore_queue_max_ops
onward) compared to the defaults?
With HDDs only I'd expect any benefits to be small and/or things to become
very uneven once the HDDs are saturated.
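Annotating the posted byte values in human units makes the spread easier to see; the comments below are my own arithmetic, not part of the posted config:

```ini
filestore_queue_max_bytes = 10485760               # 10 MiB
filestore_queue_committing_max_bytes = 10485760000 # ~9.8 GiB, 1000x the queue cap above
journal_max_write_bytes = 1073714824               # ~1 GiB (exactly 1 GiB would be 1073741824; transposed digits?)
journal_queue_max_bytes = 10485760000              # ~9.8 GiB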
> osd_max_write_size = 512
> osd_client_message_size_cap = 2147483648
> osd_deep_scrub_stride = 131072
> osd_op_threads = 8
> osd_disk_threads = 4
> osd_map_cache_size = 1024
> osd_map_cache_bl_size = 128
> osd_mount_options_xfs = "rw,noexec,nodev,noatime,nodiratime,nobarrier"
The nobarrier part is a potential recipe for disaster unless you have all
on-disk caches disabled and every other cache battery-backed.
The only devices I trust to mount nobarrier are SSDs with power-loss
protection capacitors that have been proven to do the right thing (Intel DC S
series amongst them).
> osd_recovery_op_priority = 4
> osd_recovery_max_active = 10
> osd_max_backfills = 4
>
That's sane enough.
> [client]
> rbd_cache = true
AFAIK that's the case with recent Ceph versions anyway.
> rbd_cache_size = 268435456
Are you sure that you have 256MB per client to waste on RBD cache?
If so, bully for you, but you might find that depending on your use case a
smaller RBD cache but more VM memory (for pagecache, SLAB, etc) could be
more beneficial.
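For example, a more conservative client section might look like the sketch below. The specific numbers are illustrative only; as I recall, the jewel defaults are 32 MiB for the cache and 24 MiB for the dirty limit, so verify on your own build:

```ini
[client]
rbd_cache = true
# Illustrative smaller cache; jewel defaults are believed to be
# 33554432 (32 MiB) and 25165824 (24 MiB) respectively -- verify with
# "ceph --show-config | grep rbd_cache".
rbd_cache_size = 67108864        # 64 MiB instead of the posted 256 MiB
rbd_cache_max_dirty = 50331648   # 48 MiB instead of the posted 128 MiB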
Christian
> rbd_cache_max_dirty = 134217728
> rbd_cache_max_dirty_age = 5
--
Christian Balzer Network/Systems Engineer
chibi@xxxxxxx Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com