Re: Why is my mon store.db is 220GB?

On Tue, 13 Aug 2013, Jeppesen, Nelson wrote:
> Is there an easy way I can find the age and/or expiration of the service ticket on a particular osd? Is that a file or just kept in ram?

It's just in RAM.  If you crank up debug auth = 10 you will periodically
see it dump the rotating keys and their expirations.  Ideally the middle
one will remain valid, but things won't grind to a halt until they have
all expired.

sage
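
The point about the rotating keys can be illustrated with a small Python
sketch. It is not Ceph code: the function names and the example expiration
times are hypothetical, chosen only to show that service keeps working
while at least one key is unexpired, and that the 172800-second TTL
discussed in the quoted messages below works out to 48 hours.

```python
from datetime import datetime, timedelta

def ttl_hours(ttl_seconds):
    """Convert a ticket TTL given in seconds to hours."""
    return ttl_seconds / 3600

def any_key_valid(expirations, now):
    """Return True while at least one rotating key has not yet expired."""
    return any(exp > now for exp in expirations)

# Illustrative values only; real expirations come from the debug auth output.
now = datetime(2013, 8, 13, 12, 0, 0)
expirations = [
    now - timedelta(hours=1),   # oldest key, already expired
    now + timedelta(hours=47),  # middle key, still valid
    now + timedelta(hours=95),  # newest key
]

print(ttl_hours(3600))      # the 1 hour default mentioned in the thread
print(ttl_hours(172800))    # the proposed 48 hour value
print(any_key_valid(expirations, now))
print(any_key_valid(expirations, now + timedelta(hours=96)))
```

The last call returns False because every key in the example list has
expired by then, which is the "grind to a halt" case.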

>
> 
> -----Original Message-----
> From: Sage Weil [mailto:sage@xxxxxxxxxxx] 
> Sent: Tuesday, August 13, 2013 9:01 AM
> To: Jeppesen, Nelson
> Cc: ceph-users@xxxxxxxxxxxxxx
> Subject: RE:  Why is my mon store.db is 220GB?
> 
> On Tue, 13 Aug 2013, Jeppesen, Nelson wrote:
> > Interesting,
> > 
> > So if I change 'auth service ticket ttl' to 172800 (seconds), in theory I could go without a monitor for 48 hours?
> 
> If there are no up/down events, no new clients need to start, and no OSD recovery is going on, then I *think* so.  I may be forgetting something.
> 
> sage
> 
> 
> > 
> > 
> > -----Original Message-----
> > From: Sage Weil [mailto:sage@xxxxxxxxxxx]
> > Sent: Monday, August 12, 2013 9:50 PM
> > To: Jeppesen, Nelson
> > Cc: ceph-users@xxxxxxxxxxxxxx
> > Subject: Re:  Why is my mon store.db is 220GB?
> > 
> > On Mon, 12 Aug 2013, Jeppesen, Nelson wrote:
> > > Joao,
> > > 
> > > (log file uploaded to http://pastebin.com/Ufrxn6fZ)
> > > 
> > > I had some good luck and some bad luck. I copied the store.db to a new monitor, injected a modified monmap, and started it up (this is all on the same host). Very quickly it reached quorum (as far as I can tell) but didn't respond. Running 'ceph -w' just hung, with no timeouts or errors. The same thing happened when restarting an OSD.
> > > 
> > > The last lines of the log file, '...ms_verify_authorizer..', are from 'ceph -w' attempts.
> > > 
> > > I restarted everything again and it sat there synchronizing. iostat reported about 100MB/s, but only reads. I let it sit there for 7 minutes but nothing happened.
> > 
> > Can you do this again with --debug-mon 20 --debug-ms 1?  It looks as though the main dispatch thread is blocked (7f71a1aa5700 does nothing after winning the election).  It would also be helpful to gdb attach to the running ceph-mon and capture the output from 'thread apply all bt'.
> > 
> > > Side question, how long can a ceph cluster run without a monitor? I 
> > > was able to upload files via rados gateway without issue even when 
> > > the monitor was down.
> > 
> > Quite a while, as long as no new processes need to authenticate, and no nodes go up or down.  Eventually the authentication keys are going to time out, though (1 hour is the default).
> > 
> > sage
> > 
> > 
> 
> 