The mon store is important: since your cluster isn't healthy, the mons need to hold onto that history so that when things come back up they can replay everything for the OSDs. Once you fix the 2 down and peering PGs, the mon store will fix itself in no time at all. Ceph is rightly refusing to compact that database until your cluster is healthy.
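While you work on those PGs, a few quick checks will show you exactly which PGs are stuck and how big each mon store is right now. The store.db path below is the default mon data location, so adjust it if your environment puts the mon data somewhere else:

    ceph health detail
    ceph pg dump_stuck inactive
    # run on each monitor host; the path is the default mon data dir
    du -sh /var/lib/ceph/mon/*/store.db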
It seems like there are a couple of things that might help your setup.

First, something very easy to resolve: the blocked requests. Try running the following command:

    ceph osd down 71

That command tells the cluster that osd.71 is down without restarting the actual osd daemon. osd.71 will come back and tell the mons it is actually up, but in the meantime the operations blocked on osd.71 will go to a secondary to get their response and clear up.

Second, osd.53 looks to be causing the never-ending peering. A couple of questions to check here: what is your osd_max_backfills set to? That setting is directly related to how fast osd.53 will fill back up. Something you can do to speed that up is to inject a higher setting for osd.53 only, and not the rest of the cluster:

    ceph tell osd.53 injectargs '--osd_max_backfills=20'

If the cluster is just waiting for osd.53 to finish backfilling, this will get you there faster (a quick way to check the current value and watch the backfill is sketched at the end of this mail).

I'm unfamiliar with the strategy you used to rebuild the data for osd.53. I would have removed the osd from the cluster and added it back in with the same weight. That way the osd would start right away and you would see the PGs backfilling onto it, instead of it sitting in a perpetual "booting" state. To remove the osd with minimal impact to the cluster, the following commands should get you there:

    ceph osd tree | grep 'osd.53 '
    ceph osd set nobackfill
    ceph osd set norecover
    # on the host with osd.53, stop the daemon
    ceph osd down 53
    ceph osd out 53
    ceph osd crush remove osd.53
    ceph auth rm osd.53
    ceph osd rm 53

At this point osd.53 is completely removed from the cluster, and you have the original weight of the osd (from the tree command) to set it to when you bring it back in. The down and peering PGs should now be resolved.

Now completely re-format the disk and add the osd back into the cluster. Make sure to do whatever you need for dmcrypt, journals, etc. that is specific to your environment. Once the osd is back in the cluster, up and in, reweight it to what it was before you removed it and unset norecover and nobackfill:

    ceph osd crush reweight osd.53 {{ weight_from_tree_command }}
    ceph osd unset nobackfill
    ceph osd unset norecover

At this point everything is back to the way it was and the osd should start receiving data. The only data movement should be refilling osd.53 with the data it used to have; everything else should stay the same. Increasing the backfills for this osd will help it fill up faster, but client I/O will be slower while it does.

The mon stores will remain "too big" until the backfilling onto osd.53 finishes, but once the data stops moving around and all of your osds are up and in, the mon stores will compact in no time.

I hope this helps. Ask questions if you have any, and never run a command on your cluster that you don't understand.
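If it helps, here is a rough sketch of how to check what osd_max_backfills is currently set to and keep an eye on the backfill and the mon stores while it runs. The admin-socket form assumes the default daemon setup, and 'ceph osd df' assumes a recent enough release, so adjust for your environment:

    # on the host that runs osd.53: ask the daemon what it is currently using
    ceph daemon osd.53 config get osd_max_backfills

    # from any admin node: watch recovery/backfill progress and the mon stores
    ceph -s
    ceph osd df
    du -sh /var/lib/ceph/mon/*/store.db

Once 'ceph -s' shows all PGs active+clean, the mon stores should start shrinking on their own.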
David Turner

From: ceph-users [ceph-users-bounces@xxxxxxxxxxxxxx] on behalf of Salwasser, Zac [zsalwass@xxxxxxxxxx]
Sent: Thursday, July 21, 2016 12:54 PM
To: ceph-users@xxxxxxxxxxxxxx
Cc: Heller, Chris
Subject: Uncompactable Monitor Store at 69GB -- Re: Cluster in warn state, not sure what to do next.

Rephrasing for brevity – I have a monitor store that is 69GB and won’t compact any further on restart or with ‘tell compact’. Has anyone dealt with this before?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com