Dear all;
Up until a few hours ago, I had a seemingly normally behaving cluster
(Quincy, 17.2.5) with 36 OSDs, evenly distributed across 3 of its 6
nodes. The cluster is only used for CephFS, and the only non-standard
configuration I can think of is that I had 2 active MDSs but only 1
standby. I had also doubled mds_cache_memory_limit to 8 GB (all OSD
hosts have 256 GB of RAM) at some point in the past.
Then I rebooted one of the OSD nodes. The rebooted node held one of the
active MDSs. Now the node is back up: ceph -s says the cluster is
healthy, but all PGs are in an active+clean+remapped state and 166.67%
of the objects are misplaced (dashboard: -66.66% healthy).
The data pool is a threefold replica with 5.4M objects; the number of
misplaced objects is reported as 27087410/16252446. The denominator in
the ratio makes sense to me (16.2M / 3 = 5.4M), but the numerator does
not. I also note that the ratio is *exactly* 5 / 3. The filesystem is
still mounted and appears to be usable, but df reports it as 100% full;
I suspect it would say 167% if that were not capped somewhere.
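In case it helps, here is the arithmetic as I see it; a minimal check
in plain Python, using only the two numbers quoted above (nothing else
is taken from the cluster):

    # Sanity check on the misplaced-object ratio reported by ceph -s,
    # using only the 27087410/16252446 figure and the 3x replication.
    objects = 16252446 // 3          # 5,417,482 objects in the data pool
    print(27087410 / 16252446)       # 1.666..., i.e. exactly 5/3
    print(27087410 == 5 * objects)   # True: numerator is 5 copies per object
    print(16252446 == 3 * objects)   # True: denominator is 3 copies per object

So the denominator counts three copies per object, while the numerator
counts exactly five per object.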
Any ideas about what is going on? Any suggestions for recovery?
// Best wishes; Johan