Re: Optimal or recommended threads values

Tuning these values depends on a lot more than just the SSDs and HDDs.  Which kernel and IO scheduler are you using?  Does your HBA do write caching?  It also depends on your goals: tuning for a RadosGW cluster is different than tuning for an RBD cluster.  The short answer is that you are the only person who can tell you what your optimal values are.  As always, the best benchmark is production load.


In my small cluster (5 nodes, 44 OSDs), I'm optimizing to minimize latency during recovery.  When the cluster is healthy, bandwidth and latency are more than adequate for my needs.  Even with journals on SSDs, I've found that lowering the operation and thread counts reduces my average latency.

I use injectargs to try out new values while I monitor cluster latency, both while the cluster is healthy and while it is recovering.  Only if a change proves better do I persist it to ceph.conf.  This gives me a fallback: any change that causes massive problems can be undone with a restart or reboot.
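For example, the try-then-watch cycle looks roughly like this (a sketch; the specific option values here are just placeholders, not recommendations):

```
# Inject a trial value into all OSDs at runtime -- not persisted,
# so a daemon restart reverts to whatever is in ceph.conf
ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'

# Watch per-OSD commit/apply latency and cluster events while
# healthy and during a recovery, before deciding to keep the change
ceph osd perf
ceph -w
```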


So far, the settings I've written to ceph.conf are:
[global]
  mon osd down out interval = 900
  mon osd min down reporters = 9
  mon osd min down reports = 12
  osd pool default flag hashpspool = true

[osd]
  osd max backfills = 1
  osd recovery max active = 1
  osd recovery op priority = 1


I have it on my list to investigate filestore max sync interval.  And now that I've pasted that, I need to revisit the min down reports/reporters values.  I have some nodes with 10 OSDs, and I don't want any one node to be able to mark the rest of the cluster down (it happened once).
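The direction I'm leaning is to require down reports from more OSDs than any single node holds, so a lone node can't outvote the rest (hypothetical values for my 10-OSD nodes, not tested yet):

```
[global]
  # largest node has 10 OSDs, so require more than 10 distinct
  # reporters before the mons will mark an OSD down
  mon osd min down reporters = 11
```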




On Sat, Nov 22, 2014 at 6:24 AM, Andrei Mikhailovsky <andrei@xxxxxxxxxx> wrote:
Hello guys,

Could someone comment on the optimal or recommended values of the various thread settings in ceph.conf?

At the moment I have the following settings:

filestore_op_threads = 8
osd_disk_threads = 8
osd_op_threads = 8
filestore_merge_threshold = 40
filestore_split_multiple = 8

Are these reasonable for a small cluster made of 7.2K SAS disks with SSD journals at a 4:1 ratio?

What are the settings that other people are using?

Thanks

Andrei



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


