Hi Wido,

AFAIK mons won't trim while a cluster is in HEALTH_WARN. Unset
noscrub,nodeep-scrub, get that 3rd mon up, and then it should trim
(rough command sketch below the quoted mail).

-- Dan

On Thu, Nov 3, 2016 at 10:40 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> Hi,
>
> After finally resolving the remapped PGs [0] I'm running into a problem where the MON stores are not trimming.
>
>      health HEALTH_WARN
>             noscrub,nodeep-scrub flag(s) set
>             1 mons down, quorum 0,1 1,2
>             mon.1 store is getting too big! 37115 MB >= 15360 MB
>             mon.2 store is getting too big! 26327 MB >= 15360 MB
>
> At first I thought it was due to the remapped PGs and the cluster not being active+clean, but after this was resolved the stores still wouldn't trim, not even when a compact was forced.
>
> I tried to force a sync of one of the MONs; that works, but it seems that the Paxos entries are not trimmed from the store.
>
> A snippet of the log from the MON which is syncing:
>
> 2016-11-03 10:18:05.354643 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> 2016-11-03 10:18:05.368222 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 448496 bytes last_key paxos,13242098) v2
> 2016-11-03 10:18:05.368229 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 448496 bytes last_key paxos,13242098) v2
> 2016-11-03 10:18:05.379160 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> 2016-11-03 10:18:05.387253 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 2512885 bytes last_key paxos,13242099) v2
> 2016-11-03 10:18:05.387260 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 2512885 bytes last_key paxos,13242099) v2
> 2016-11-03 10:18:05.409084 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> 2016-11-03 10:18:05.424569 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 804569 bytes last_key paxos,13242142) v2
> 2016-11-03 10:18:05.424576 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 804569 bytes last_key paxos,13242142) v2
> 2016-11-03 10:18:05.435102 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> 2016-11-03 10:18:05.442261 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 2522418 bytes last_key paxos,13242143) v2
> 2016-11-03 10:18:05.442270 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 2522418 bytes last_key paxos,13242143) v2
>
> In the tracker [1] I found an issue which looks similar, but that issue was resolved over 3 years ago.
>
> Looking at mon.1 for example:
>
> root@mon1:/var/lib/ceph/mon/ceph-mon1/store.db# ls|wc -l
> 12769
> root@mon1:/var/lib/ceph/mon/ceph-mon1/store.db# du -sh .
> 37G     .
> root@mon1:/var/lib/ceph/mon/ceph-mon1/store.db#
>
> To clarify, these Monitors already had their big data store under Dumpling and were recently upgraded to Firefly and Hammer.
>
> All PGs are active+clean at the moment, but it seems that the MON stores mainly contain Paxos entries which are not trimmed.
>
> root@mon3:/var/lib/ceph/mon# ceph-monstore-tool ceph-mon3 dump-keys|awk '{print $1}'|uniq -c
>      96 auth
>    1143 logm
>       3 mdsmap
>       1 mkfs
>       1 mon_sync
>       6 monitor
>       3 monmap
>    1158 osdmap
>  358364 paxos
>     656 pgmap
>       6 pgmap_meta
>     168 pgmap_osd
>    6144 pgmap_pg
> root@mon3:/var/lib/ceph/mon#
>
> So there are 358k Paxos entries in the MON store.
>
> Any suggestions on how to trim those from the MON store(s)?
>
> Wido
>
>
> [0]: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-November/014113.html
> [1]: http://tracker.ceph.com/issues/4895
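
For reference, Dan's suggestion above translates into roughly the following commands. This is only a sketch: the mon names (1, 2, 3) are the ones visible in Wido's output, so adjust them for your own cluster:

    ceph osd unset noscrub
    ceph osd unset nodeep-scrub

    # With mon.3 back and all three mons in quorum, wait until 'ceph -s'
    # reports HEALTH_OK; the paxos versions should then start trimming on
    # their own. A manual compaction afterwards reclaims the disk space:
    ceph tell mon.1 compact
    ceph tell mon.2 compact
    ceph tell mon.3 compact

Alternatively, setting "mon compact on start = true" in the [mon] section of ceph.conf and restarting the monitors one at a time compacts each store at daemon startup.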
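
To confirm that trimming actually resumed, the key dump from above can be repeated once the cluster has been HEALTH_OK for a while; the paxos count should drop from ~358k to a few hundred entries (roughly the retained paxos window), and store.db should shrink accordingly after compaction. A sketch, using the same paths as in Wido's output; ceph-monstore-tool opens the store directly, so run it against a stopped monitor (or a copy of the store directory):

    # mon.3 stopped, or working on a copy of its store
    cd /var/lib/ceph/mon
    ceph-monstore-tool ceph-mon3 dump-keys|awk '{print $1}'|uniq -c
    du -sh ceph-mon3/store.db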