Hi all,

there have been many reports about slow backfill lately, and most of them seemed related to a problem with mclock op scheduling in Quincy. The hallmark was that backfill started fast and then slowed down a lot. I now make the same observation on an Octopus cluster with wpq, and it looks very much like a problem with scheduling backfill operations. Here is what I see:

We added 95 disks to a set of disks shared by 2 pools. This is about 8% of the total number of disks, and they were distributed over all 12 OSD hosts. The 2 pools are 8+2 and 8+3 EC fs-data pools. Initially the backfill was as fast as expected, but over the last day it was really slow compared with expectation: only 33 PGs were backfilling. I have osd_max_backfills=3, and a simple estimate says there should be between 100 and 200 PGs backfilling (the 95 new OSDs alone offer 95*3=285 backfill-target slots, so even with contention for slots on the source OSDs, 100-200 concurrent backfills should be sustainable).

To speed things up, I increased osd_max_backfills to 5, and the number of backfilling PGs jumped right up to over 200. That's way more than the relative increase would warrant. Just to check, I set osd_max_backfills=3 again to see if the number of backfilling PGs would drop back to about 30. But no! Now I have 142 PGs backfilling, which is what I expected in the first place.

This looks very much like PGs that are eligible for backfill don't start, or backfill reservations get dropped for some reason. Can anyone help me figure out what the problem might be? I don't want to run a cron job that toggles osd_max_backfills up and down; there must be something else at play here.

Output of ceph status and the config set commands is below. The number of backfilling PGs is decreasing again, and I would really like this to be stable by itself. To give an idea of the scale: we are talking about a rebalancing that takes either 2 weeks or 2 months. That's not a trivial issue.

Thanks and best regards,
Frank
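P.S.: For what it's worth, this is roughly how I would check whether the backfill reservation slots are actually being used (just a sketch; osd.NNN is a placeholder for one of the newly added OSDs, and the daemon command has to be run on the host that carries that OSD):

ceph pg ls backfilling | wc -l
    # rough count of PGs currently in state backfilling (the output includes a header line)
ceph daemon osd.NNN dump_recovery_reservations
    # dumps the local/remote recovery and backfill reservations currently held by that OSD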
[root@gnosis ~]# ceph config dump | sed -e "s/  */ /g" | grep :hdd | grep osd_max_backfills
osd class:hdd advanced osd_max_backfills 3

[root@gnosis ~]# ceph status
  cluster:
    id:     ###
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 7d)
    mgr: ceph-25(active, since 10w), standbys: ceph-03, ceph-02, ceph-01, ceph-26
    mds: con-fs2:8 4 up:standby 8 up:active
    osd: 1260 osds: 1260 up (since 2d), 1260 in (since 2d); 6487 remapped pgs

  task status:

  data:
    pools:   14 pools, 25065 pgs
    objects: 1.49G objects, 2.8 PiB
    usage:   3.4 PiB used, 9.7 PiB / 13 PiB avail
    pgs:     2466697364/12910834502 objects misplaced (19.106%)
             18571 active+clean
             6453  active+remapped+backfill_wait
             34    active+remapped+backfilling
             7     active+clean+snaptrim

  io:
    client:   30 MiB/s rd, 221 MiB/s wr, 1.08k op/s rd, 1.54k op/s wr
    recovery: 1.0 GiB/s, 380 objects/s

[root@gnosis ~]# ceph config set osd/class:hdd osd_max_backfills 5
[root@gnosis ~]# ceph status
  cluster:
    id:     ###
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 7d)
    mgr: ceph-25(active, since 10w), standbys: ceph-03, ceph-02, ceph-01, ceph-26
    mds: con-fs2:8 4 up:standby 8 up:active
    osd: 1260 osds: 1260 up (since 2d), 1260 in (since 2d); 6481 remapped pgs

  task status:

  data:
    pools:   14 pools, 25065 pgs
    objects: 1.49G objects, 2.8 PiB
    usage:   3.4 PiB used, 9.7 PiB / 13 PiB avail
    pgs:     2466120124/12911195308 objects misplaced (19.101%)
             18574 active+clean
             6247  active+remapped+backfill_wait
             234   active+remapped+backfilling
             6     active+clean+snaptrim
             2     active+clean+scrubbing+deep
             2     active+clean+scrubbing

  io:
    client:   34 MiB/s rd, 236 MiB/s wr, 1.28k op/s rd, 2.03k op/s wr
    recovery: 6.4 GiB/s, 2.39k objects/s

[root@gnosis ~]# ceph config set osd/class:hdd osd_max_backfills 3
[root@gnosis ~]# ceph status
  cluster:
    id:     ###
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum ceph-01,ceph-02,ceph-03,ceph-25,ceph-26 (age 7d)
    mgr: ceph-25(active, since 10w), standbys: ceph-03, ceph-02, ceph-01, ceph-26
    mds: con-fs2:8 4 up:standby 8 up:active
    osd: 1260 osds: 1260 up (since 2d), 1260 in (since 2d); 6481 remapped pgs

  task status:

  data:
    pools:   14 pools, 25065 pgs
    objects: 1.49G objects, 2.8 PiB
    usage:   3.4 PiB used, 9.7 PiB / 13 PiB avail
    pgs:     2465974875/12911218789 objects misplaced (19.099%)
             18578 active+clean
             6339  active+remapped+backfill_wait
             142   active+remapped+backfilling
             6     active+clean+snaptrim

  io:
    client:   32 MiB/s rd, 247 MiB/s wr, 1.10k op/s rd, 1.57k op/s wr
    recovery: 4.2 GiB/s, 1.56k objects/s

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx