You probably want to consider increasing osd_max_backfills; its default of 1 is often what holds recovery back on an otherwise idle cluster.
You should be able to inject this online.
You might also want to drop your osd_recovery_max_active setting back down to around 2 or 3, although with it being all SSD your performance will probably be fine.
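A minimal sketch of what that could look like, assuming you want to change it cluster-wide (4 and 3 are just starting points, tune to taste):

  ceph tell osd.* injectargs '--osd-max-backfills 4'
  ceph tell osd.* injectargs '--osd-recovery-max-active 3'

Keep an eye on ceph -s while injecting; if this was the bottleneck, the number of backfilling PGs should jump almost immediately.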
On Fri, 5 Jan 2018 at 20:13 Stefan Kooman <stefan@xxxxxx> wrote:
Hi,
I know I'm not the only one with this question, as I have seen similar questions on this list:
How to speed up recovery / backfilling?
Current status:

  pgs: 155325434/800312109 objects degraded (19.408%)
       1395 active+clean
        440 active+undersized+degraded+remapped+backfill_wait
         21 active+undersized+degraded+remapped+backfilling

  io:
    client:   180 kB/s rd, 5776 kB/s wr, 273 op/s rd, 440 op/s wr
    recovery: 2990 kB/s, 109 keys/s, 114 objects/s
What did we do? Shut down one DC, filled the cluster with loads of objects, then turned the DC back on (size = 3, min_size = 2). To test exactly this: recovery.
I have been going through all the recovery options (including legacy ones) but I cannot get the recovery speed to increase:
osd_recovery_op_priority 63
osd_client_op_priority 3
^^ yup, reversed those, to no avail
osd_recovery_max_active 10
^^ This helped for a short period of time, and then it went back to
"slow" mode
osd_recovery_max_omap_entries_per_chunk 0
osd_recovery_max_chunk 67108864
Haven't seen any change in recovery speed.
osd_recovery_sleep_ssd 0.000000
^^ default for SSD
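For reference, the live values can be read back from a running OSD via the admin socket, e.g. (osd.0 just as an example):

  ceph daemon osd.0 config get osd_recovery_max_active
  ceph daemon osd.0 config get osd_recovery_sleep_ssd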
The whole cluster is idle, OSDs have very low load. What can be the reason for the slow recovery? Something is holding it back, but I cannot think of what.
Ceph Luminous 12.2.2 (BlueStore on LVM, all SSD)
Thanks,
Stefan
--
| BIT BV http://www.bit.nl/ Kamer van Koophandel 09090351
| GPG: 0xD14839C6 +31 318 648 688 / info@xxxxxx
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com