Marc,

Marc Roos wrote:
: Are you sure your osd's are up and reachable? (run ceph osd tree on
: another node)

They are up, because all three mons see them as up. However, "ceph osd
tree" provided the hint (thanks!): the OSD host came back with the
hostname "localhost" instead of the correct one for some reason, so its
OSDs moved themselves into a new host=localhost CRUSH node directly
under the CRUSH root.

I rebooted the OSD host once again, this time it came up with the
correct hostname, and the "ceph osd tree" output looks sane again. So I
guess we have the reason for such a huge rebalance.
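(For the archives: one way to keep OSDs from following a bogus hostname
into a new CRUSH bucket at boot is to stop them from updating their
CRUSH location on startup. This is only a sketch; option names are as in
Mimic, and the host= value is a placeholder, not my real tree:

    # /etc/ceph/ceph.conf on the OSD hosts
    [osd]
        osd crush update on start = false
        # or, instead of disabling the update, pin the location explicitly:
        # crush location = root=default host=<the-osd-host>

With "osd crush update on start = false" the OSDs keep whatever position
the CRUSH map already has for them, no matter what hostname the box
boots with.)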
However, even though the OSD tree is back in the normal state, the
rebalance is still going on, and there are even inactive PGs, with some
Ceph clients being stuck seemingly forever:

  health: HEALTH_ERR
          1964645/3977451 objects misplaced (49.395%)
          Reduced data availability: 11 pgs inactive
          Degraded data redundancy: 315678/3977451 objects degraded (7.937%),
              542 pgs degraded, 546 pgs undersized
          Degraded data redundancy (low space): 76 pgs backfill_toofull

  services:
    mon: 3 daemons, quorum stratus1,stratus2,stratus3
    mgr: stratus3(active), standbys: stratus1, stratus2
    osd: 44 osds: 44 up, 44 in; 1806 remapped pgs
    rgw: 1 daemon active

  data:
    pools:   9 pools, 3360 pgs
    objects: 1.33 M objects, 5.0 TiB
    usage:   25 TiB used, 465 TiB / 490 TiB avail
    pgs:     0.327% pgs not active
             315678/3977451 objects degraded (7.937%)
             1964645/3977451 objects misplaced (49.395%)
             1554 active+clean
             1226 active+remapped+backfill_wait
              482 active+undersized+degraded+remapped+backfill_wait
               51 active+undersized+degraded+remapped+backfill_wait+backfill_toofull
               25 active+remapped+backfill_wait+backfill_toofull
                6 activating+remapped
                5 active+undersized+remapped+backfill_wait
                4 activating+undersized+degraded+remapped
                4 active+undersized+degraded+remapped+backfilling
                2 active+remapped+backfilling
                1 activating+degraded+remapped

  io:
    client:   0 B/s rd, 126 KiB/s wr, 0 op/s rd, 5 op/s wr
    recovery: 52 MiB/s, 13 objects/s

# ceph pg ls | grep activating
23.298 622 622    0 0 2591064064 3041 activating+undersized+degraded+remapped 2019-05-15 15:03:04.626434 102870'1371081 103721:1369041   [8,20,70]p8      [8,20]p8 2019-05-15 02:10:34.972050 2019-05-15 02:10:34.972050
23.2cb 695 695  695 0 2885144354 3097 activating+undersized+degraded+remapped 2019-05-15 15:03:04.592438  102890'828931 103721:1594128   [0,70,78]p0    [21,78]p21 2019-05-15 10:23:02.789435 2019-05-14 00:46:19.161050
23.346 623   1 1245 0 2602515968 3076 activating+degraded+remapped            2019-05-15 14:56:05.317986 103083'1061153 103721:3719154 [78,79,26]p78  [26,23,5]p26 2019-05-15 10:21:17.388467 2019-05-15 10:21:17.388467
23.436 664   0  664 0 2767360000 3079 activating+remapped                     2019-05-15 15:05:19.349660  103083'987000 103721:1525097 [13,70,19]p13 [13,19,18]p13 2019-05-14 09:43:52.924297 2019-05-08 04:24:41.251620
23.454 696   0 1846 0 2872765970 3031 activating+remapped                     2019-05-15 15:05:19.152343 102896'1092297 103721:1607448   [2,69,70]p2 [24,12,75]p24 2019-05-15 14:06:45.123388 2019-05-11 21:53:50.183932
23.490 636   0  636 0 2635874322 3064 activating+remapped                     2019-05-15 15:05:19.368037 103083'4996760 103721:1789524  [13,70,1]p13  [13,1,24]p13 2019-05-14 05:16:51.180417 2019-05-09 04:51:52.645295
23.4f5 633   0 1266 0 2641321984 3084 activating+remapped                     2019-05-15 14:56:04.248887 103035'4667973 103721:2116544 [70,72,27]p70 [25,27,79]p25 2019-05-15 01:07:28.978979 2019-05-08 07:20:08.253942
23.76b 596   0 1192 0 2481048116 3025 activating+remapped                     2019-05-15 15:05:19.135491 102723'1445725 103721:1907186 [70,13,72]p70  [26,13,8]p26 2019-05-14 17:04:13.644789 2019-05-14 17:04:13.644789
23.7e1 604   0  604 0 2517671954 3008 activating+remapped                     2019-05-15 14:56:04.246016  102730'739689 103721:1262764   [8,79,21]p8   [8,21,26]p8 2019-05-14 13:57:52.964361 2019-05-13 09:54:51.371622
62.4b  108 794    0 0   74451903 1028 activating+undersized+degraded+remapped 2019-05-15 14:56:04.330268    102517'1028   103721:22340 [79,78,20]p79    [78,20]p78 2019-05-14 16:30:18.090859 2019-05-14 16:30:18.090859
62.4e  118 386    0 0  103058459 1011 activating+undersized+degraded+remapped 2019-05-15 15:05:17.348109    102517'1011   103721:24725 [77,70,19]p77    [77,19]p77 2019-05-15 13:36:55.090172 2019-05-14 08:40:20.383295
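(Also for the archives, the kind of checks that should show why these
PGs are stuck activating and whether the backfillfull threshold is what
blocks the backfill. This is only a sketch; the 0.92 below is an example
value, not a recommendation:

    # ceph pg 23.298 query              # look at the recovery_state section
    # ceph osd df tree                  # per-OSD utilization within the CRUSH tree
    # ceph osd dump | grep full_ratio   # current nearfull/backfillfull/full ratios
    # ceph osd set-backfillfull-ratio 0.92   # temporarily raise the backfillfull
                                             # threshold if that is what blocks it

The pg query output usually says what the PG is still waiting for.)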
-Yenya

: From: Jan Kasprzak [mailto:kas@xxxxxxxxxx]
: Sent: Wednesday, 15 May 2019 14:46
: To: ceph-users@xxxxxxxx
: Subject: Huge rebalance after rebooting OSD host (Mimic)
:
: Hello, Ceph users,
:
: I wanted to install the recent kernel update on my OSD hosts with
: CentOS 7, Ceph 13.2.5 Mimic. So I set the noout flag and ran
: "yum -y update" on the first OSD host. This host has 8 bluestore OSDs
: with data on HDDs and databases on LVs of two SSDs (each SSD has 4 LVs
: for OSD metadata).
:
: Everything went OK, so I rebooted this host. After the OSD host came
: back online, the cluster went from HEALTH_WARN (noout flag set) to
: HEALTH_ERR and started to rebalance itself, with reportedly almost
: 60 % of objects misplaced, and some of them degraded. And, of course,
: backfill_toofull:
:
:   cluster:
:     health: HEALTH_ERR
:             2300616/3975384 objects misplaced (57.872%)
:             Degraded data redundancy: 74263/3975384 objects degraded (1.868%),
:                 146 pgs degraded, 122 pgs undersized
:             Degraded data redundancy (low space): 44 pgs backfill_toofull
:
:   services:
:     mon: 3 daemons, quorum stratus1,stratus2,stratus3
:     mgr: stratus3(active), standbys: stratus1, stratus2
:     osd: 44 osds: 44 up, 44 in; 2022 remapped pgs
:     rgw: 1 daemon active
:
:   data:
:     pools:   9 pools, 3360 pgs
:     objects: 1.33 M objects, 5.0 TiB
:     usage:   25 TiB used, 465 TiB / 490 TiB avail
:     pgs:     74263/3975384 objects degraded (1.868%)
:              2300616/3975384 objects misplaced (57.872%)
:              1739 active+remapped+backfill_wait
:              1329 active+clean
:               102 active+recovery_wait+remapped
:                76 active+undersized+degraded+remapped+backfill_wait
:                31 active+remapped+backfill_wait+backfill_toofull
:                30 active+recovery_wait+undersized+degraded+remapped
:                21 active+recovery_wait+degraded+remapped
:                 8 active+undersized+degraded+remapped+backfill_wait+backfill_toofull
:                 6 active+recovery_wait+degraded
:                 4 active+remapped+backfill_toofull
:                 3 active+recovery_wait+undersized+degraded
:                 3 active+remapped+backfilling
:                 2 active+recovery_wait
:                 2 active+recovering+undersized
:                 1 active+clean+remapped
:                 1 active+undersized+degraded+remapped+backfill_toofull
:                 1 active+undersized+degraded+remapped+backfilling
:                 1 active+recovering+undersized+remapped
:
:   io:
:     client:   681 B/s rd, 1013 KiB/s wr, 0 op/s rd, 32 op/s wr
:     recovery: 142 MiB/s, 93 objects/s
:
: (Note that I cleared the noout flag afterwards.) What is wrong with it?
: Why did the cluster decide to rebalance itself?
:
: I am keeping the rest of the OSD hosts unrebooted for now.
:
: Thanks,
:
: -Yenya

-- 
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| http://www.fi.muni.cz/~kas/                         GPG: 4096R/A45477D5 |
 sir_clive> I hope you don't mind if I steal some of your ideas?
 laryross> As far as stealing... we call it sharing here.   --from rcgroups
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com