Re: after loss of journal, osd fails to start with failed assert OSDMapRef OSDService::get_map(epoch_t) ret != null

The rule of thumb is that the data on an OSD is gone if its journal is gone.
A journal doesn't just "vanish", though, so you should investigate further...

This log is from the new empty journal, right?
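
One rough way to check that from the log itself is to pull out the sequence numbers the journal reports when it is opened. The sketch below is only an illustration: the helper names (unquote, journal_seqs) are made up here, and reading "committed_seq 0" as the sign of a freshly created journal is an assumption, not something the log states.

```python
import re

# Matches the FileJournal message seen in the quoted log, e.g.
# "journal open advancing committed_seq 0 to fs op_seq 2160370".
SEQ_RE = re.compile(r"advancing committed_seq (\d+) to fs op_seq (\d+)")


def unquote(text):
    """Strip leading '>' mail quoting and join wrapped lines into one string."""
    lines = [re.sub(r"^>\s?", "", line) for line in text.splitlines()]
    return " ".join(line.strip() for line in lines)


def journal_seqs(text):
    """Return (committed_seq, fs_op_seq) from the log text, or None if absent."""
    m = SEQ_RE.search(unquote(text))
    return (int(m.group(1)), int(m.group(2))) if m else None


# Two lines copied from the quoted log, still carrying the mail quoting.
sample = """\
>   -14> 2015-12-01 07:46:35.598912 7fadb7f1e900  2 journal open
> advancing committed_seq 0 to fs op_seq 2160370
"""

seqs = journal_seqs(sample)
if seqs is not None:
    committed, fs_op = seqs
    # Assumption: a journal at sequence 0 next to a filestore at a large
    # op_seq suggests the journal was created after the data was written.
    print(f"journal committed_seq={committed}, filestore op_seq={fs_op}")
```

If the journal sequence is 0 while the filestore is at a large op_seq, that would be consistent with a log taken after --mkjournal, though it proves nothing about what was in the lost journal.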

Jan

> On 08 Dec 2015, at 08:08, Benedikt Fraunhofer <fraunhofer@xxxxxxxxxx> wrote:
> 
> Hello List,
> 
> after a box crashed, the journal vanished. Creating a new one
> with --mkjournal results in the osd being unable to start.
> Does anyone want to dissect this any further or should I just trash
> the osd and recreate it?
> 
> Thx in advance
>  Benedikt
> 
> 2015-12-01 07:46:31.505255 7fadb7f1e900  0 ceph version 0.94.5
> (9764da52395923e0b32908d83a9f7304401fee43), process ceph-osd, pid 5486
> 2015-12-01 07:46:31.628585 7fadb7f1e900  0
> filestore(/var/lib/ceph/osd/ceph-328) backend xfs (magic 0x58465342)
> 2015-12-01 07:46:31.662972 7fadb7f1e900  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_features:
> FIEMAP ioctl is supported and appears to work
> 2015-12-01 07:46:31.662984 7fadb7f1e900  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_features:
> FIEMAP ioctl is disabled via 'filestore fiemap' config option
> 2015-12-01 07:46:31.674999 7fadb7f1e900  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_features:
> syncfs(2) syscall fully supported (by glibc and kernel)
> 2015-12-01 07:46:31.675071 7fadb7f1e900  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_feature:
> extsize is supported and kernel 3.19.0-33-generic >= 3.5
> 2015-12-01 07:46:31.806490 7fadb7f1e900  0
> filestore(/var/lib/ceph/osd/ceph-328) mount: enabling WRITEAHEAD
> journal mode: checkpoint is not enabled
> 2015-12-01 07:46:35.598698 7fadb7f1e900  1 journal _open
> /var/lib/ceph/osd/ceph-328/journal fd 19: 9663676416 bytes, block size
> 4096 bytes, directio = 1, aio = 1
> 2015-12-01 07:46:35.600956 7fadb7f1e900  1 journal _open
> /var/lib/ceph/osd/ceph-328/journal fd 19: 9663676416 bytes, block size
> 4096 bytes, directio = 1, aio = 1
> 2015-12-01 07:46:35.619860 7fadb7f1e900  0 <cls>
> cls/hello/cls_hello.cc:271: loading cls_hello
> 2015-12-01 07:46:35.682532 7fadb7f1e900 -1 osd/OSD.h: In function
> 'OSDMapRef OSDService::get_map(epoch_t)' thread 7fadb7f1e900 time
> 2015-12-01 07:46:35.681204
> osd/OSD.h: 716: FAILED assert(ret)
> 
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x8b) [0xbc60eb]
> 2: (OSDService::get_map(unsigned int)+0x3f) [0x70ad5f]
> 3: (OSD::init()+0x6ad) [0x6c5e0d]
> 4: (main()+0x2860) [0x6527e0]
> 5: (__libc_start_main()+0xf5) [0x7fadb505bec5]
> 6: /usr/bin/ceph-osd() [0x66b887]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> 
> --- begin dump of recent events ---
>   -62> 2015-12-01 07:46:31.503728 7fadb7f1e900  5 asok(0x5402000)
> register_command perfcounters_dump hook 0x53a2050
>   -61> 2015-12-01 07:46:31.503759 7fadb7f1e900  5 asok(0x5402000)
> register_command 1 hook 0x53a2050
>   -60> 2015-12-01 07:46:31.503764 7fadb7f1e900  5 asok(0x5402000)
> register_command perf dump hook 0x53a2050
>   -59> 2015-12-01 07:46:31.503768 7fadb7f1e900  5 asok(0x5402000)
> register_command perfcounters_schema hook 0x53a2050
>   -58> 2015-12-01 07:46:31.503772 7fadb7f1e900  5 asok(0x5402000)
> register_command 2 hook 0x53a2050
>   -57> 2015-12-01 07:46:31.503775 7fadb7f1e900  5 asok(0x5402000)
> register_command perf schema hook 0x53a2050
>   -56> 2015-12-01 07:46:31.503786 7fadb7f1e900  5 asok(0x5402000)
> register_command perf reset hook 0x53a2050
>   -55> 2015-12-01 07:46:31.503790 7fadb7f1e900  5 asok(0x5402000)
> register_command config show hook 0x53a2050
>   -54> 2015-12-01 07:46:31.503792 7fadb7f1e900  5 asok(0x5402000)
> register_command config set hook 0x53a2050
>   -53> 2015-12-01 07:46:31.503797 7fadb7f1e900  5 asok(0x5402000)
> register_command config get hook 0x53a2050
>   -52> 2015-12-01 07:46:31.503799 7fadb7f1e900  5 asok(0x5402000)
> register_command config diff hook 0x53a2050
>   -51> 2015-12-01 07:46:31.503802 7fadb7f1e900  5 asok(0x5402000)
> register_command log flush hook 0x53a2050
>   -50> 2015-12-01 07:46:31.503804 7fadb7f1e900  5 asok(0x5402000)
> register_command log dump hook 0x53a2050
>   -49> 2015-12-01 07:46:31.503807 7fadb7f1e900  5 asok(0x5402000)
> register_command log reopen hook 0x53a2050
>   -48> 2015-12-01 07:46:31.505255 7fadb7f1e900  0 ceph version 0.94.5
> (9764da52395923e0b32908d83a9f7304401fee43), process ceph-osd, pid 5486
>   -47> 2015-12-01 07:46:31.619430 7fadb7f1e900  1 -- 10.9.246.104:0/0
> learned my addr 10.9.246.104:0/0
>   -46> 2015-12-01 07:46:31.619439 7fadb7f1e900  1
> accepter.accepter.bind my_inst.addr is 10.9.246.104:6821/5486
> need_addr=0
>   -45> 2015-12-01 07:46:31.619457 7fadb7f1e900  1
> accepter.accepter.bind my_inst.addr is 0.0.0.0:6824/5486 need_addr=1
>   -44> 2015-12-01 07:46:31.619473 7fadb7f1e900  1
> accepter.accepter.bind my_inst.addr is 0.0.0.0:6825/5486 need_addr=1
>   -43> 2015-12-01 07:46:31.619492 7fadb7f1e900  1 -- 10.9.246.104:0/0
> learned my addr 10.9.246.104:0/0
>   -42> 2015-12-01 07:46:31.619496 7fadb7f1e900  1
> accepter.accepter.bind my_inst.addr is 10.9.246.104:6827/5486
> need_addr=0
>   -41> 2015-12-01 07:46:31.620890 7fadb7f1e900  5 asok(0x5402000)
> init /var/run/ceph/ceph-osd.328.asok
>   -40> 2015-12-01 07:46:31.620901 7fadb7f1e900  5 asok(0x5402000)
> bind_and_listen /var/run/ceph/ceph-osd.328.asok
>   -39> 2015-12-01 07:46:31.620931 7fadb7f1e900  5 asok(0x5402000)
> register_command 0 hook 0x539e0b0
>   -38> 2015-12-01 07:46:31.620937 7fadb7f1e900  5 asok(0x5402000)
> register_command version hook 0x539e0b0
>   -37> 2015-12-01 07:46:31.620946 7fadb7f1e900  5 asok(0x5402000)
> register_command git_version hook 0x539e0b0
>   -36> 2015-12-01 07:46:31.620954 7fadb7f1e900  5 asok(0x5402000)
> register_command help hook 0x53a2140
>   -35> 2015-12-01 07:46:31.620962 7fadb7f1e900  5 asok(0x5402000)
> register_command get_command_descriptions hook 0x53a2130
>   -34> 2015-12-01 07:46:31.621012 7fadb7f1e900 10 monclient(hunting):
> build_initial_monmap
>   -33> 2015-12-01 07:46:31.621014 7fadb2024700  5 asok(0x5402000) entry start
>   -32> 2015-12-01 07:46:31.627732 7fadb7f1e900  5 adding auth protocol: cephx
>   -31> 2015-12-01 07:46:31.627740 7fadb7f1e900  5 adding auth protocol: cephx
> 
>   -30> 2015-12-01 07:46:31.627832 7fadb7f1e900  5 asok(0x5402000)
> register_command objecter_requests hook 0x53a2170
>   -29> 2015-12-01 07:46:31.627894 7fadb7f1e900  1 --
> 10.9.246.104:6821/5486 messenger.start
>   -28> 2015-12-01 07:46:31.627923 7fadb7f1e900  1 -- :/0 messenger.start
>   -27> 2015-12-01 07:46:31.627947 7fadb7f1e900  1 --
> 10.9.246.104:6827/5486 messenger.start
>   -26> 2015-12-01 07:46:31.627977 7fadb7f1e900  1 --
> 0.0.0.0:6825/5486 messenger.start
>   -25> 2015-12-01 07:46:31.628002 7fadb7f1e900  1 --
> 0.0.0.0:6824/5486 messenger.start
>   -24> 2015-12-01 07:46:31.628026 7fadb7f1e900  1 -- :/0 messenger.start
>   -23> 2015-12-01 07:46:31.628075 7fadb7f1e900  2 osd.328 0 mounting
> /var/lib/ceph/osd/ceph-328 /var/lib/ceph/osd/ceph-328/journal
>   -22> 2015-12-01 07:46:31.628585 7fadb7f1e900  0
> filestore(/var/lib/ceph/osd/ceph-328) backend xfs (magic 0x58465342)
>   -21> 2015-12-01 07:46:31.662972 7fadb7f1e900  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_features:
> FIEMAP ioctl is supported and appears to work
>   -20> 2015-12-01 07:46:31.662984 7fadb7f1e900  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_features:
> FIEMAP ioctl is disabled via 'filestore fiemap' config option
>   -19> 2015-12-01 07:46:31.674999 7fadb7f1e900  0
> genericfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_features:
> syncfs(2) syscall fully supported (by glibc and kernel)
>   -18> 2015-12-01 07:46:31.675071 7fadb7f1e900  0
> xfsfilestorebackend(/var/lib/ceph/osd/ceph-328) detect_feature:
> extsize is supported and kernel 3.19.0-33-generic >= 3.5
>   -17> 2015-12-01 07:46:31.806490 7fadb7f1e900  0
> filestore(/var/lib/ceph/osd/ceph-328) mount: enabling WRITEAHEAD
> journal mode: checkpoint is not enabled
>   -16> 2015-12-01 07:46:35.596667 7fadb7f1e900  2 journal open
> /var/lib/ceph/osd/ceph-328/journal fsid
> b054d6cb-8b2a-41aa-ac12-1169b42d036d fs_op_seq 2160370
>   -15> 2015-12-01 07:46:35.598698 7fadb7f1e900  1 journal _open
> /var/lib/ceph/osd/ceph-328/journal fd 19: 9663676416 bytes, block size
> 4096 bytes, directio = 1, aio = 1
>   -14> 2015-12-01 07:46:35.598912 7fadb7f1e900  2 journal open
> advancing committed_seq 0 to fs op_seq 2160370
>   -13> 2015-12-01 07:46:35.598947 7fadb7f1e900  2 journal No further
> valid entries found, journal is most likely valid
>   -12> 2015-12-01 07:46:35.598954 7fadb7f1e900  2 journal No further
> valid entries found, journal is most likely valid
>   -11> 2015-12-01 07:46:35.598956 7fadb7f1e900  3 journal
> journal_replay: end of journal, done.
>   -10> 2015-12-01 07:46:35.600956 7fadb7f1e900  1 journal _open
> /var/lib/ceph/osd/ceph-328/journal fd 19: 9663676416 bytes, block size
> 4096 bytes, directio = 1, aio = 1
>    -9> 2015-12-01 07:46:35.601366 7fadb7f1e900  2 osd.328 0 boot
>    -8> 2015-12-01 07:46:35.615322 7fadb7f1e900  1 <cls>
> cls/replica_log/cls_replica_log.cc:141: Loaded replica log class!
>    -7> 2015-12-01 07:46:35.615507 7fadb7f1e900  1 <cls>
> cls/statelog/cls_statelog.cc:306: Loaded log class!
>    -6> 2015-12-01 07:46:35.617645 7fadb7f1e900  1 <cls>
> cls/rgw/cls_rgw.cc:3047: Loaded rgw class!
>    -5> 2015-12-01 07:46:35.619520 7fadb7f1e900  1 <cls>
> cls/refcount/cls_refcount.cc:231: Loaded refcount class!
>    -4> 2015-12-01 07:46:35.619596 7fadb7f1e900  1 <cls>
> cls/user/cls_user.cc:367: Loaded user class!
>    -3> 2015-12-01 07:46:35.619671 7fadb7f1e900  1 <cls>
> cls/log/cls_log.cc:312: Loaded log class!
>    -2> 2015-12-01 07:46:35.619794 7fadb7f1e900  1 <cls>
> cls/version/cls_version.cc:227: Loaded version class!
>    -1> 2015-12-01 07:46:35.619860 7fadb7f1e900  0 <cls>
> cls/hello/cls_hello.cc:271: loading cls_hello
>     0> 2015-12-01 07:46:35.682532 7fadb7f1e900 -1 osd/OSD.h: In
> function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7fadb7f1e900
> time 2015-12-01 07:46:35.681204
> osd/OSD.h: 716: FAILED assert(ret)
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x8b) [0xbc60eb]
> 2: (OSDService::get_map(unsigned int)+0x3f) [0x70ad5f]
> 3: (OSD::init()+0x6ad) [0x6c5e0d]
> 4: (main()+0x2860) [0x6527e0]
> 5: (__libc_start_main()+0xf5) [0x7fadb505bec5]
> 6: /usr/bin/ceph-osd() [0x66b887]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> 
> --- logging levels ---
>   0/ 5 none
>   0/ 1 lockdep
>   0/ 1 context
>   1/ 1 crush
>   1/ 5 mds
>   1/ 5 mds_balancer
>   1/ 5 mds_locker
>   1/ 5 mds_log
>   1/ 5 mds_log_expire
>   1/ 5 mds_migrator
>   0/ 1 buffer
>   0/ 1 timer
>   0/ 1 filer
>   0/ 1 striper
>   0/ 1 objecter
>   0/ 5 rados
>   0/ 5 rbd
>   0/ 5 rbd_replay
>   0/ 5 journaler
>   0/ 5 objectcacher
>   0/ 5 client
>   0/ 5 osd
>   0/ 5 optracker
> [...]
> --- end dump of recent events ---
> 2015-12-01 07:46:35.684846 7fadb7f1e900 -1 *** Caught signal (Aborted) **
> in thread 7fadb7f1e900
> 
> ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> 1: /usr/bin/ceph-osd() [0xacd7ba]
> 2: (()+0x10340) [0x7fadb6bd1340]
> 3: (gsignal()+0x39) [0x7fadb5070cc9]
> 4: (abort()+0x148) [0x7fadb50740d8]
> 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fadb597b535]
> 6: (()+0x5e6d6) [0x7fadb59796d6]
> 7: (()+0x5e703) [0x7fadb5979703]
> 8: (()+0x5e922) [0x7fadb5979922]
> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> const*)+0x278) [0xbc62d8]
> 10: (OSDService::get_map(unsigned int)+0x3f) [0x70ad5f]
> 11: (OSD::init()+0x6ad) [0x6c5e0d]
> 12: (main()+0x2860) [0x6527e0]
> 13: (__libc_start_main()+0xf5) [0x7fadb505bec5]
> 14: /usr/bin/ceph-osd() [0x66b887]
> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> needed to interpret this.
> 
> --- begin dump of recent events ---
>     0> 2015-12-01 07:46:35.684846 7fadb7f1e900 -1 *** Caught signal
> (Aborted) **
> in thread 7fadb7f1e900
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
