Re: Monitors stores not trimming after upgrade from Dumpling to Hammer


 



> On 3 November 2016 at 10:46, Wido den Hollander <wido@xxxxxxxx> wrote:
> 
> 
> 
> > On 3 November 2016 at 10:42, Dan van der Ster <dan@xxxxxxxxxxxxxx> wrote:
> > 
> > 
> > Hi Wido,
> > 
> > AFAIK the mons won't trim while a cluster is in HEALTH_WARN. Unset
> > noscrub,nodeep-scrub, get that 3rd mon up, and then it should trim.
> > 
> 
> The 3rd MON is back, but AFAIK the MONs trim when all PGs are active+clean. A cluster can go into WARN state for almost any reason, e.g. old CRUSH tunables.
> 
> Will give it a try though.

No, it didn't. Health is OK, but still, the MON stores will not trim. A manual compaction actually grew the store from 25GB to 39GB.

The MON stores still hold a large number of Paxos keys.
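
For anyone who wants to reproduce the per-prefix key count without depending on the dump being sorted (uniq -c only merges adjacent identical lines), here is a minimal Python sketch. It assumes, as the awk pipeline in my original mail quoted below does, that every line of the dump-keys output starts with the key prefix as its first whitespace-separated field. The function name count_key_prefixes and the sample lines are only illustrative; they are not part of any Ceph tool.

```python
from collections import Counter


def count_key_prefixes(lines):
    """Count keys per prefix, taking the first whitespace-separated
    field of each line as the prefix (same idea as awk '{print $1}')."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


# Hypothetical sample input; the real input would be the text produced by
# `ceph-monstore-tool <store path> dump-keys`, one key per line.
sample = [
    "auth / 1",
    "paxos / 13242098",
    "paxos / 13242099",
    "osdmap / 42",
    "paxos / 13242142",
]

for prefix, n in count_key_prefixes(sample).most_common():
    print(f"{n:8d} {prefix}")
```

To run it against a real store, save the dump-keys output to a file and pass the open file object to count_key_prefixes instead of the sample list.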

Wido

> 
> Wido
> 
> > -- Dan
> > 
> > 
> > On Thu, Nov 3, 2016 at 10:40 AM, Wido den Hollander <wido@xxxxxxxx> wrote:
> > > Hi,
> > >
> > > After finally resolving the remapped PGs [0] I'm running into a problem where the MON stores are not trimming.
> > >
> > >      health HEALTH_WARN
> > >             noscrub,nodeep-scrub flag(s) set
> > >             1 mons down, quorum 0,1 1,2
> > >             mon.1 store is getting too big! 37115 MB >= 15360 MB
> > >             mon.2 store is getting too big! 26327 MB >= 15360 MB
> > >
> > > At first I thought it was due to the remapped PGs and the cluster not being active+clean, but after this was resolved the stores still wouldn't trim, not even when a compaction was forced.
> > >
> > > I tried forcing a sync of one of the MONs. That works, but it seems that the Paxos entries are not trimmed from the store.
> > >
> > > A snippet of log from the Mon which is syncing:
> > >
> > > 2016-11-03 10:18:05.354643 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> > > 2016-11-03 10:18:05.368222 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 448496 bytes last_key paxos,13242098) v2
> > > 2016-11-03 10:18:05.368229 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 448496 bytes last_key paxos,13242098) v2
> > > 2016-11-03 10:18:05.379160 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> > > 2016-11-03 10:18:05.387253 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 2512885 bytes last_key paxos,13242099) v2
> > > 2016-11-03 10:18:05.387260 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 2512885 bytes last_key paxos,13242099) v2
> > > 2016-11-03 10:18:05.409084 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> > > 2016-11-03 10:18:05.424569 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 804569 bytes last_key paxos,13242142) v2
> > > 2016-11-03 10:18:05.424576 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 804569 bytes last_key paxos,13242142) v2
> > > 2016-11-03 10:18:05.435102 7f6f90988700 10 mon.3@2(synchronizing) e1 sync_reset_timeout
> > > 2016-11-03 10:18:05.442261 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync mon_sync(chunk cookie 3288334339 lc 174329061 bl 2522418 bytes last_key paxos,13242143) v2
> > > 2016-11-03 10:18:05.442270 7f6f90988700 10 mon.3@2(synchronizing) e1 handle_sync_chunk mon_sync(chunk cookie 3288334339 lc 174329061 bl 2522418 bytes last_key paxos,13242143) v2
> > >
> > > In the tracker [1] I found an issue that looks similar, but that issue was resolved over 3 years ago.
> > >
> > > Looking at mon.1 for example:
> > >
> > > root@mon1:/var/lib/ceph/mon/ceph-mon1/store.db# ls|wc -l
> > > 12769
> > > root@mon1:/var/lib/ceph/mon/ceph-mon1/store.db# du -sh .
> > > 37G     .
> > > root@mon1:/var/lib/ceph/mon/ceph-mon1/store.db#
> > >
> > > To clarify, these Monitors already had their big data store under Dumpling and were recently upgraded to Firefly and Hammer.
> > >
> > > All PGs are active+clean at the moment, but it seems that the MON stores mainly contain the Paxos entries which are not trimmed.
> > >
> > > root@mon3:/var/lib/ceph/mon# ceph-monstore-tool ceph-mon3 dump-keys|awk '{print $1}'|uniq -c
> > >      96 auth
> > >    1143 logm
> > >       3 mdsmap
> > >       1 mkfs
> > >       1 mon_sync
> > >       6 monitor
> > >       3 monmap
> > >    1158 osdmap
> > >  358364 paxos
> > >     656 pgmap
> > >       6 pgmap_meta
> > >     168 pgmap_osd
> > >    6144 pgmap_pg
> > > root@mon3:/var/lib/ceph/mon#
> > >
> > > So there are 358k Paxos entries in the Mon store.
> > >
> > > Any suggestions on how to trim those from the MON store(s)?
> > >
> > > Wido
> > >
> > >
> > > [0]: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-November/014113.html
> > > [1]: http://tracker.ceph.com/issues/4895
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


