SSD Recovery Settings

I set up an SSD Luminous 12.2.11 cluster and realized after data had been added that pg_num was not set properly on the default.rgw.buckets.data pool (where all the data goes).  I adjusted the setting upward, but recovery is going really slowly (around 56-110 MiB/s), ticking down at about .002 per log entry in ceph -w.  These are all SSDs on Luminous 12.2.11 (no journal drives), with a set of two 10Gb fiber twinax links in a bonded LACP config.  There are six servers and 60 OSDs, each OSD 2TB.  About 4TB of data (3 million objects) had been added to the cluster before I noticed the red blinking lights…
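
For context, the pg bump was along these lines (the 1024 target below is just an illustrative number, not necessarily the exact value I used):

ceph osd pool set default.rgw.buckets.data pg_num 1024

ceph osd pool set default.rgw.buckets.data pgp_num 1024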

 

I tried adjusting the recovery settings with:

ceph tell 'osd.*' injectargs '--osd-max-backfills 16'

ceph tell 'osd.*' injectargs '--osd-recovery-max-active 30'
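
To double-check that the new values actually applied, they can be read back from an OSD's admin socket on the host running it (osd.0 below is just an example id):

ceph daemon osd.0 config get osd_max_backfills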

 

That helped a little, but didn't have the impact I was looking for.  I have used these settings on HDD clusters before to speed things up (though with 8 backfills and 4 max active; see the commands below).  Did I miss something, or is this just part of the pg expansion process?  Should I be doing something else with SSD clusters?
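
For reference, the HDD tuning mentioned above was roughly:

ceph tell 'osd.*' injectargs '--osd-max-backfills 8'

ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'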

 

Regards,

-Brent

 

Existing Clusters:

Test: Luminous 12.2.11 with 3 osd servers, 1 mon/mgr, 1 gateway (all virtual on SSD)

US Production (HDD): Jewel 10.2.11 with 5 osd servers, 3 mons, 3 gateways behind haproxy LB

UK Production (HDD): Luminous 12.2.11 with 15 osd servers, 3 mons/mgr, 3 gateways behind haproxy LB

US Production (SSD): Luminous 12.2.11 with 6 osd servers, 3 mons/mgr, 3 gateways behind haproxy LB

 

 

 
