Re: thanks for a double check on ceph's config

Hello,

On Tue, 10 May 2016 16:50:17 +0800 Geocast Networks wrote:

> Hello Chris,
> 
> We don't use SSDs as journals.
> Each host has one Intel E5-2620 CPU, which has 6 cores.
That should be enough.

> The networking (both cluster and data networks) is 10Gbps.
>
12 HDDs will barely saturate a 10Gb/s link during writes; if you care
about fast reads you may be better off with a single, bonded 20Gb/s
network instead of separate cluster and public networks.
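
If you go down that road, a rough sketch of an 802.3ad bond on Ubuntu
16.04 with ifenslave might look like this in /etc/network/interfaces
(interface names and addresses are placeholders, and the switch needs a
matching LACP configuration):

auto bond0
iface bond0 inet static
    address 10.0.0.11
    netmask 255.255.255.0
    bond-slaves enp3s0f0 enp3s0f1
    bond-mode 802.3ad
    bond-miimon 100
    bond-xmit-hash-policy layer3+4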
 
> My further questions include,
> 
> (1) osd_mkfs_type = xfs
> osd_mkfs_options_xfs = -f
> filestore_xattr_use_omap = true
> 
> For an XFS filesystem, we should not enable filestore_xattr_use_omap = true,
> should we?
> 
You don't need to; AFAIK this switch doesn't cause any overhead if it isn't
needed.
Somebody actually using XFS or who knows the code may pipe up here.
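
If you want to sanity-check that xattrs behave on your OSD filesystems, a
quick manual test along these lines would do (the path assumes the default
data dir for osd.0, and setfattr/getfattr come from the attr package):

# create a scratch file on the OSD filesystem, set and read back an xattr
touch /var/lib/ceph/osd/ceph-0/xattr_test
setfattr -n user.test -v ok /var/lib/ceph/osd/ceph-0/xattr_test
getfattr -n user.test /var/lib/ceph/osd/ceph-0/xattr_test
rm /var/lib/ceph/osd/ceph-0/xattr_test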

> (2) filestore_queue_max_ops = 25000
> filestore_queue_max_bytes = 10485760
> filestore_queue_committing_max_ops = 5000
> filestore_queue_committing_max_bytes = 10485760000
> journal_max_write_bytes = 1073714824
> journal_max_write_entries = 10000
> journal_queue_max_ops = 50000
> journal_queue_max_bytes = 10485760000
> 
> Since we don't have SSDs as journals, are all these settings too large? What
> are better values?
> 
You really want to test them against the defaults.

The defaults are designed for use with HDD-only OSDs, so they are
probably your best bet unless you feel like empirical testing.
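
Something like the following makes the comparison easy (osd.0 is just an
example ID, run the daemon command on the host that carries that OSD, and
substitute whatever test pool you have for "rbd"):

# defaults compiled into your Ceph version
ceph --show-config | grep -E 'filestore_queue|journal_queue|journal_max_write'

# what a running OSD actually uses, i.e. with your overrides applied
ceph daemon osd.0 config show | grep -E 'filestore_queue|journal_queue|journal_max_write'

# then benchmark each variant, e.g. 60 seconds of 4MB writes, 32 in flight
rados bench -p rbd 60 write -t 32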

> (3) osd_mount_options_xfs =
> "rw,noexec,nodev,noatime,nodiratime,nobarrier"
> What are your suggested options here?
> 

As I said, lose the "nobarrier".
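
That is, something like this, plus a check of the drives' volatile write
caches (sdb is a placeholder, repeat per data disk):

osd_mount_options_xfs = "rw,noexec,nodev,noatime,nodiratime"

# show whether the drive's volatile write cache is enabled
hdparm -W /dev/sdb
# turn it off if there is no battery/flash-backed cache in front of it
hdparm -W0 /dev/sdb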

Christian

> Thanks a lot.
> 
> 
> 2016-05-10 15:31 GMT+08:00 Christian Balzer <chibi@xxxxxxx>:
> 
> > On Tue, 10 May 2016 11:48:07 +0800 Geocast wrote:
> >
> > Hello,
> >
> > > We have 21 hosts for ceph OSD servers, each host has 12 SATA disks
> > > (4TB each), 64GB memory.
> > No journal SSDs?
> > What CPU(s) and network?
> >
> > > ceph version 10.2.0, Ubuntu 16.04 LTS
> > > The whole cluster is new installed.
> > >
> > > Can you help check what the arguments we put in ceph.conf is
> > > reasonable or not?
> > > thanks.
> > >
> > > [osd]
> > > osd_data = /var/lib/ceph/osd/ceph-$id
> > > osd_journal_size = 20000
> > Overkill most likely, but not an issue.
> >
> > > osd_mkfs_type = xfs
> > > osd_mkfs_options_xfs = -f
> > > filestore_xattr_use_omap = true
> > > filestore_min_sync_interval = 10
> > Are you aware of what this does, and have you actually tested it (IOPS
> > AND throughput) with various other settings on your hardware to arrive
> > at this number?
> >
> > > filestore_max_sync_interval = 15
> > That's fine in and of itself, and unlikely to ever be reached anyway.
> >
> > > filestore_queue_max_ops = 25000
> > > filestore_queue_max_bytes = 10485760
> > > filestore_queue_committing_max_ops = 5000
> > > filestore_queue_committing_max_bytes = 10485760000
> > > journal_max_write_bytes = 1073714824
> > > journal_max_write_entries = 10000
> > > journal_queue_max_ops = 50000
> > > journal_queue_max_bytes = 10485760000
> > Same as above: have you tested these settings (from
> > filestore_queue_max_ops onward) compared to the defaults?
> >
> > With HDDs only I'd expect any benefits to be small and/or things to
> > become very uneven once the HDDs are saturated.
> >
> > > osd_max_write_size = 512
> > > osd_client_message_size_cap = 2147483648
> > > osd_deep_scrub_stride = 131072
> > > osd_op_threads = 8
> > > osd_disk_threads = 4
> > > osd_map_cache_size = 1024
> > > osd_map_cache_bl_size = 128
> > > osd_mount_options_xfs =
> > > "rw,noexec,nodev,noatime,nodiratime,nobarrier"
> > The nobarrier part is a potential recipe for disaster unless you
> > have all on-disk caches disabled and every other cache battery-backed.
> >
> > The only devices I trust to mount with nobarrier are SSDs with power-loss
> > protection capacitors that have been proven to do the right thing
> > (Intel DC S series amongst them).
> >
> > > osd_recovery_op_priority = 4
> > > osd_recovery_max_active = 10
> > > osd_max_backfills = 4
> > >
> > That's sane enough.
> >
> > > [client]
> > > rbd_cache = true
> > AFAIK that's the case with recent Ceph versions anyway.
> >
> > > rbd_cache_size = 268435456
> >
> > Are you sure that you have 256MB per client to waste on RBD cache?
> > If so, bully for you, but you might find that depending on your use
> > case a smaller RBD cache but more VM memory (for pagecache, SLAB, etc)
> > could be more beneficial.
> >
> > > rbd_cache_max_dirty = 134217728
> > > rbd_cache_max_dirty_age = 5
> >
> > Christian
> > --
> > Christian Balzer        Network/Systems Engineer
> > chibi@xxxxxxx           Global OnLine Japan/Rakuten Communications
> > http://www.gol.com/
> >


-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


