Re: How to migrate from a "missing auth" monitor files to a regular one?

Thank you. After running 'ceph auth add mon.a' 25 times, the unpatched version works.

Here are the detailed steps:
1. Stop the cluster (mon, osd and mds) and back up the current /var/lib/ceph/mon/ceph-a dir.
2. Start the patched ceph-mon and ceph-osd (I am not sure whether ceph-osd is necessary).
3. Run 'ceph auth add mon.a' 25 times.
4. Stop ceph-mon and ceph-osd, then run the unpatched ceph-mon with the command 'ceph-mon -i a -f'; it works.
5. Stop ceph-mon and back up the current, working /var/lib/ceph/mon/ceph-a dir.
6. Revert to the /var/lib/ceph/mon/ceph-a dir saved in step 1 and run the unpatched ceph-mon again,
to confirm that ceph-mon does not start with this version of the files (it throws errors).
7. Switch back to the dir saved in step 5.
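
Step 3 can be scripted rather than typed by hand. The following is only a minimal sketch: the function name bump_auth_db is made up for illustration, and it assumes the patched ceph-mon from step 2 is already running and reachable by the 'ceph' command.

```shell
# Hypothetical helper: run 'ceph auth add ENTITY' COUNT times.
# Usage: bump_auth_db COUNT ENTITY
bump_auth_db() {
    count=$1
    entity=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # Same command as in step 3; a failed attempt is reported but
        # does not stop the loop.
        ceph auth add "$entity" || echo "attempt $i failed" >&2
        i=$((i + 1))
    done
}

# Example invocation (needs a running monitor):
# bump_auth_db 25 mon.a
```

The count of 25 is simply the number that worked here; the sketch does not check whether the auth db version actually advanced.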



On Mon, Aug 26, 2013 at 12:16 AM, Sage Weil <sage@xxxxxxxxxxx> wrote:
On Sun, 25 Aug 2013, Yu Changyuan wrote:
> Today, when I restarted the ceph service, the problem I asked about on the
> mailing list before happened again
> (http://article.gmane.org/gmane.comp.file-systems.ceph.user/2995):
> ceph-mon refuses to start and reports the error below:
>
> 2013-08-25 18:24:52.465600 7fb50a496780 -1 mon/AuthMonitor.cc: In function
> 'virtual void AuthMonitor::update_from_paxos(bool*)' thread 7fb50a496780
> time 2013-08-25 18:24:52.453920
> mon/AuthMonitor.cc: 152: FAILED assert(ret == 0)
>
>  ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
>  1: (AuthMonitor::update_from_paxos(bool*)+0x1fee) [0x57742e]
>  2: (PaxosService::refresh(bool*)+0x18d) [0x4f630d]
>  3: (Monitor::refresh_from_paxos(bool*)+0x57) [0x496477]
>  4: (Monitor::init_paxos()+0xf5) [0x496635]
>  5: (Monitor::preinit()+0x6bc) [0x4ad1dc]
>  6: (main()+0x1bec) [0x48ac8c]
>  7: (__libc_start_main()+0xed) [0x7fb5084c660d]
>  8: ceph-mon() [0x48dab9]
>
> Then I switched to the "wip-mon-skip-auth-cuttlefish" branch. ceph-mon
> complained about some "missing auth inc" (from 1 to 500) and continued
> running, and then everything was OK again.
>
> But when I stop this patched ceph-mon and try to start the regular unpatched
> ceph-mon, the above error happens again. As I mentioned, the ceph-mon files
> I used last time are not the final ones with the 'missing auth' problem, but
> the files from 2 days before ceph-mon failed, with which ceph-mon actually
> started OK but ceph-osd refused to work.
>
> So I want to know how to make these ceph-mon files, which only work with
> the patched ceph-mon, work again with the regular unpatched ceph-mon.

Without seeing logs and knowing exactly what is going on, my first guess
is that running 'ceph auth add' or 'ceph auth import' commands that modify
the auth db 25 times will get you past the gap.  After that, the mon should
start with the unpatched version.

If that doesn't fix it, can you generate a log with 'debug ms = 1' 'debug
paxos = 20' 'debug mon = 20' and share that?
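
One possible way to apply those three debug settings for a single run is to pass them as command-line options when starting the monitor in the foreground. This is a sketch only: the function name start_mon_debug and the MON_BIN variable are hypothetical, and it assumes the installed ceph-mon accepts config options in '--debug-mon 20' form.

```shell
# Hypothetical helper: start monitor MON_ID in the foreground with the
# debug levels requested above.
# MON_BIN is an illustrative override for the binary; it defaults to ceph-mon.
start_mon_debug() {
    mon_id=$1
    "${MON_BIN:-ceph-mon}" -i "$mon_id" -f \
        --debug-ms 1 --debug-paxos 20 --debug-mon 20
}

# Example invocation:
# start_mon_debug a
```

Where the resulting log is written depends on the local log settings, which this sketch does not change.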

Thanks-
sage



--
Best regards,
Changyuan