On Tue, Jan 10, 2017 at 8:23 AM, Marcus Müller <mueller.marcus@xxxxxxxxx> wrote:
Hi all,

Recently I added a new node with new osds to my cluster, which, of course, resulted in backfilling. At the end, there are 4 pgs left stuck in the state active+remapped and I don't know what to do. Here is how my cluster currently looks:

ceph -s
    health HEALTH_WARN
           4 pgs stuck unclean
           recovery 3586/58734009 objects degraded (0.006%)
           recovery 420074/58734009 objects misplaced (0.715%)
           noscrub,nodeep-scrub flag(s) set
     monmap e9: 5 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0,ceph5=192.168.60.11:6789/0}
            election epoch 478, quorum 0,1,2,3,4 ceph1,ceph2,ceph3,ceph4,ceph5
     osdmap e3114: 9 osds: 9 up, 9 in; 4 remapped pgs
            flags noscrub,nodeep-scrub
      pgmap v9970276: 320 pgs, 3 pools, 4831 GB data, 19119 kobjects
            15152 GB used, 40719 GB / 55872 GB avail
            3586/58734009 objects degraded (0.006%)
            420074/58734009 objects misplaced (0.715%)
                 316 active+clean
                   4 active+remapped
  client io 643 kB/s rd, 7 op/s

# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR
 0 1.28899 1.00000   3724G  1697G  2027G 45.57 1.68
 1 1.57899 1.00000   3724G  1706G  2018G 45.81 1.69
 2 1.68900 1.00000   3724G  1794G  1929G 48.19 1.78
 3 6.78499 1.00000   7450G  1240G  6209G 16.65 0.61
 4 8.39999 1.00000   7450G  1226G  6223G 16.47 0.61
 5 9.51500 1.00000   7450G  1237G  6212G 16.62 0.61
 6 7.66499 1.00000   7450G  1264G  6186G 16.97 0.63
 7 9.75499 1.00000   7450G  2494G  4955G 33.48 1.23
 8 9.32999 1.00000   7450G  2491G  4958G 33.45 1.23
              TOTAL  55872G 15152G 40719G 27.12
MIN/MAX VAR: 0.61/1.78  STDDEV: 13.54

# ceph health detail
HEALTH_WARN 4 pgs stuck unclean; recovery 3586/58734015 objects degraded (0.006%); recovery 420074/58734015 objects misplaced (0.715%); noscrub,nodeep-scrub flag(s) set
pg 9.7 is stuck unclean for 512936.160212, current state active+remapped, last acting [7,3,0]
pg 7.84 is stuck unclean for 512623.894574, current state active+remapped, last acting [4,8,1]
pg 8.1b is stuck unclean for 513164.616377, current state active+remapped, last acting [4,7,2]
pg 7.7a is stuck unclean for 513162.316328, current state active+remapped, last acting [7,4,2]
recovery 3586/58734015 objects degraded (0.006%)
recovery 420074/58734015 objects misplaced (0.715%)
noscrub,nodeep-scrub flag(s) set

# ceph osd tree
ID WEIGHT   TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 56.00693 root default
-2  1.28899     host ceph1
 0  1.28899         osd.0       up  1.00000          1.00000
-3  1.57899     host ceph2
 1  1.57899         osd.1       up  1.00000          1.00000
-4  1.68900     host ceph3
 2  1.68900         osd.2       up  1.00000          1.00000
-5 32.36497     host ceph4
 3  6.78499         osd.3       up  1.00000          1.00000
 4  8.39999         osd.4       up  1.00000          1.00000
 5  9.51500         osd.5       up  1.00000          1.00000
 6  7.66499         osd.6       up  1.00000          1.00000
-6 19.08498     host ceph5
 7  9.75499         osd.7       up  1.00000          1.00000
 8  9.32999         osd.8       up  1.00000          1.00000

I'm using a customized crushmap because, as you can see, this cluster is not very optimal. Ceph1, ceph2 and ceph3 are VMs on one physical host; ceph4 and ceph5 are both separate physical hosts. So the idea is to spread 33% of the data across ceph1, ceph2 and ceph3, and the other 66% across ceph4 and ceph5.

Everything went fine with the backfilling, but now those 4 pgs have been stuck in active+remapped for 2 days while the number of degraded objects keeps increasing. I restarted all osds one after another, but that did not really help: at first no degraded objects were shown, then the count increased again.

What can I do to get those pgs back to the active+clean state? My idea was to increase the weight of one osd a little bit in order to make ceph recalculate the map. Is that a good idea?
Trying google with "ceph pg stuck in active and remapped" points to a couple of posts on this ML, typically indicating that it's a problem with the CRUSH map and ceph being unable to satisfy the mapping rules. Your ceph -s output indicates that you're using replication of size 3 in your pools. You also said you have a custom CRUSH map - can you post it?
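In case it helps, a common way to grab the CRUSH map in a postable form, and to look at one of the stuck PGs directly, is something like the following (the output file names are just placeholders, and pg 9.7 is taken from your health detail output):

# ceph osd getcrushmap -o crushmap.bin        # dump the compiled CRUSH map
# crushtool -d crushmap.bin -o crushmap.txt   # decompile it into readable text for posting
# ceph osd crush rule dump                    # show the rules your pools are using
# ceph pg 9.7 query                           # compare the "up" and "acting" sets of a stuck PG

The remapped state means the acting set differs from the up set CRUSH currently computes, so the pg query output should show where the mapping disagrees.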
---
On the other hand, I saw something very strange too: after the backfill was done (2 days ago), my ceph osd df looked like this:

# ceph osd df
ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE  VAR
 0 1.28899 1.00000   3724G  1924G  1799G 51.67 1.79
 1 1.57899 1.00000   3724G  2143G  1580G 57.57 2.00
 2 1.68900 1.00000   3724G  2114G  1609G 56.78 1.97
 3 6.78499 1.00000   7450G  1234G  6215G 16.57 0.58
 4 8.39999 1.00000   7450G  1221G  6228G 16.40 0.57
 5 9.51500 1.00000   7450G  1232G  6217G 16.54 0.57
 6 7.66499 1.00000   7450G  1258G  6191G 16.89 0.59
 7 9.75499 1.00000   7450G  2482G  4967G 33.33 1.16
 8 9.32999 1.00000   7450G  2480G  4969G 33.30 1.16
              TOTAL  55872G 16093G 39779G 28.80
MIN/MAX VAR: 0.57/2.00  STDDEV: 17.54

While ceph -s was:

    health HEALTH_WARN
           4 pgs stuck unclean
           recovery 1698/58476648 objects degraded (0.003%)
           recovery 418137/58476648 objects misplaced (0.715%)
           noscrub,nodeep-scrub flag(s) set
     monmap e9: 5 mons at {ceph1=192.168.10.3:6789/0,ceph2=192.168.10.4:6789/0,ceph3=192.168.10.5:6789/0,ceph4=192.168.60.6:6789/0,ceph5=192.168.60.11:6789/0}
            election epoch 464, quorum 0,1,2,3,4 ceph1,ceph2,ceph3,ceph4,ceph5
     osdmap e3086: 9 osds: 9 up, 9 in; 4 remapped pgs
            flags noscrub,nodeep-scrub
      pgmap v9928160: 320 pgs, 3 pools, 4809 GB data, 19035 kobjects
            16093 GB used, 39779 GB / 55872 GB avail
            1698/58476648 objects degraded (0.003%)
            418137/58476648 objects misplaced (0.715%)
                 316 active+clean
                   4 active+remapped
  client io 757 kB/s rd, 1 op/s

As you can see from the output further above, my current ceph osd df looks completely different. That suggests the first three osds lost data (about 1 TB) without any backfill going on. Adding up the usage of osd.0, osd.1 and osd.2 here gives 6181 GB, but they should only hold around 33%, so this would be wrong.
I might be missing something here, but I don't quite see how you come to this conclusion. ceph osd df and ceph -s both show 16093 GB used and 39779 GB out of 55872 GB available. The sum of the first 3 OSDs' used space is, as you stated, 6181 GB, which is approx. 38.4%, so quite close to your target of 33%.
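Spelling out that arithmetic with the numbers from the ceph osd df output you posted:

  1924G + 2143G + 2114G = 6181G used on osd.0, osd.1 and osd.2
  6181G / 16093G total used ≈ 0.384, i.e. roughly 38.4%

CRUSH places PGs pseudo-randomly, so a few percent of deviation from the nominal 33% is expected rather than a sign of missing data.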
My question on this is: is this a bug and did I really lose important data, or is this a ceph cleanup action after the backfill?

Thanks and regards,
Marcus
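Not a definitive answer, but one way to sanity-check whether objects actually went missing, as opposed to replicas simply being moved or cleaned up after the backfill, is to compare per-pool object counts with the raw usage, e.g.:

# rados df                      # per-pool object counts and usage
# ceph df                       # pool-level data vs. raw cluster usage
# ceph pg dump_stuck unclean    # the four stuck PGs with their up/acting sets

If the object counts per pool stay stable while raw used space on those OSDs drops, that points at old replicas being trimmed after backfill rather than data loss.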
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com