Hi,
on a 3-node hyper-converged Proxmox VE cluster with 12 SSD OSDs I am
experiencing stalls in RBD performance during normal backfill
operations, e.g. when moving a pool from 2/1 to 3/2 (size/min_size).
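For clarity, by "moving a pool from 2/1 to 3/2" I mean changing size and
min_size roughly like this ("rbd" here is only an example pool name):
# raise the replica count from 2 to 3 and min_size from 1 to 2
ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2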
I was expecting that I could control the load caused by the backfilling
using
ceph tell 'osd.*' injectargs '--osd-max-backfills 1'
or
ceph tell 'osd.*' injectargs '--osd-recovery-max-active 1'
but neither had any noticeable effect. Even
ceph tell 'osd.*' config set osd_recovery_sleep_ssd 2.1
did not help.
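(In case it matters: checking whether such values actually reach the
running daemons should work with something like the following, with
osd.0 just as a representative example:
# value as seen by the mgr for the running daemon
ceph config show osd.0 osd_max_backfills
# value reported directly by the daemon itself
ceph tell osd.0 config get osd_recovery_sleep_ssd
The same settings could also be made persistent in the config database,
e.g. "ceph config set osd osd_max_backfills 1", instead of injectargs.)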
Any hints?
Normal operation looks like:
2022-11-19T16:16:52.142355+0000 mgr.pve-02 (mgr.18134134) 60414 : cluster [DBG] pgmap v59642: 576 pgs: 576 active+clean; 2.4 TiB data, 4.7 TiB used, 12 TiB / 16 TiB avail; 3.3 KiB/s rd, 2.7 MiB/s wr, 63 op/s
2022-11-19T16:16:54.144082+0000 mgr.pve-02 (mgr.18134134) 60416 : cluster [DBG] pgmap v59643: 576 pgs: 576 active+clean; 2.4 TiB data, 4.7 TiB used, 12 TiB / 16 TiB avail; 2.7 KiB/s rd, 1.3 MiB/s wr, 56 op/s
I am running Ceph Quincy 17.2.5 on a test system with a dedicated
1 Gbit/s (MTU 9000) storage network, while the 1 Gbit/s (MTU 1500)
Ceph public network is shared with the VM network.
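For completeness, the client vs. recovery I/O rates during backfill can
be watched per pool with something like this (again, "rbd" only as an
example pool name):
# per-pool client and recovery I/O rates
ceph osd pool stats rbd
# overall cluster health and recovery progress
ceph -s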
I am looking forward to your suggestions.
Regards,
ppa. Martin Konold
--
Martin Konold - Prokurist, CTO
KONSEC GmbH - make things real
Amtsgericht Stuttgart, HRB 23690
Geschäftsführer: Andreas Mack
Im Köller 3, 70794 Filderstadt, Germany