Re: Sudden loss of all SSD OSDs in a cluster, immediate abort on restart [Mimic 13.2.6]

On Thu, Aug 15, 2019 at 2:09 AM Troy Ablan <tablan@xxxxxxxxx> wrote:
>
> Paul,
>
> Thanks for the reply.  All of these seemed to fail except for pulling
> the osdmap from the live cluster.
>
> -Troy
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap45
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>    what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f945ee04f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f94531935d0]
>   2: (gsignal()+0x37) [0x7f9451d80207]
>   3: (abort()+0x148) [0x7f9451d818f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f945268f7d5]
>   5: (()+0x5e746) [0x7f945268d746]
>   6: (()+0x5e773) [0x7f945268d773]
>   7: (__cxa_rethrow()+0x49) [0x7f945268d9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f94553218d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f94550ff4ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9455101db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55de1f9a6e60]
>   12: (main()+0x5340) [0x55de1f8c8870]
>   13: (__libc_start_main()+0xf5) [0x7f9451d6c3d5]
>   14: (()+0x3adc10) [0x55de1f9a1c10]
> Aborted
>
> -[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap46
> terminate called after throwing an instance of
> 'ceph::buffer::malformed_input'
>    what():  buffer::malformed_input: unsupported bucket algorithm: -1
> *** Caught signal (Aborted) **
>   in thread 7f9ce4135f00 thread_name:ceph-objectstor
>   ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic
> (stable)
>   1: (()+0xf5d0) [0x7f9cd84c45d0]
>   2: (gsignal()+0x37) [0x7f9cd70b1207]
>   3: (abort()+0x148) [0x7f9cd70b28f8]
>   4: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7f9cd79c07d5]
>   5: (()+0x5e746) [0x7f9cd79be746]
>   6: (()+0x5e773) [0x7f9cd79be773]
>   7: (__cxa_rethrow()+0x49) [0x7f9cd79be9e9]
>   8: (CrushWrapper::decode(ceph::buffer::list::iterator&)+0x18b8)
> [0x7f9cda6528d8]
>   9: (OSDMap::decode(ceph::buffer::list::iterator&)+0x4ad) [0x7f9cda4304ad]
>   10: (OSDMap::decode(ceph::buffer::list&)+0x31) [0x7f9cda432db1]
>   11: (get_osdmap(ObjectStore*, unsigned int, OSDMap&,
> ceph::buffer::list&)+0x1d0) [0x55cea26c8e60]
>   12: (main()+0x5340) [0x55cea25ea870]
>   13: (__libc_start_main()+0xf5) [0x7f9cd709d3d5]
>   14: (()+0x3adc10) [0x55cea26c3c10]
> Aborted
>
> -[~:#]- ceph osd getmap -o osdmap
> got osdmap epoch 81298
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-46/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.
>
> -[~:#]- ceph-objectstore-tool --op set-osdmap --data-path
> /var/lib/ceph/osd/ceph-45/ --file osdmap
> osdmap (#-1:92f679f2:::osdmap.81298:0#) does not exist.

  auto ch = store->open_collection(coll_t::meta());
  const ghobject_t full_oid = OSD::get_osdmap_pobject_name(e);
  if (!store->exists(ch, full_oid)) {
    cerr << "osdmap (" << full_oid << ") does not exist." << std::endl;
    if (!force) {
      return -ENOENT;
    }
    cout << "Creating a new epoch." << std::endl;
  }

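Without "--force", set-osdmap returns -ENOENT when the epoch you are
writing does not already exist on that OSD. Adding "--force" should get
you past that error; the tool then prints "Creating a new epoch." and
carries on.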
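
As a rough sketch only, the retry could look like the following. It reuses
the OSD ids (45, 46) and the "osdmap" file from your transcript and assumes
the OSD daemons are stopped. The RUN variable and the force_set_osdmap
function are hypothetical names I made up for illustration: RUN defaults to
"echo", so the script only prints the commands until you set RUN to an
empty string.

```shell
#!/bin/sh
# Sketch: re-run set-osdmap with --force on the two OSDs from the thread.
# Assumes ./osdmap was fetched with `ceph osd getmap -o osdmap` and that
# the OSD daemons are not running.
# RUN is a hypothetical dry-run switch: it defaults to "echo", so the
# commands are only printed. Invoke with RUN="" to actually execute them.
RUN="${RUN-echo}"

# Hypothetical helper: $1 = OSD id, $2 = osdmap file to inject.
force_set_osdmap() {
    osd_id="$1"
    map_file="$2"
    $RUN ceph-objectstore-tool --op set-osdmap \
        --data-path "/var/lib/ceph/osd/ceph-${osd_id}/" \
        --file "$map_file" --force
}

for id in 45 46; do
    force_set_osdmap "$id" osdmap
done
```

Check the printed commands first, then rerun with RUN="" once they look
right. It is probably worth trying one OSD before touching the other.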

>
>
>
> On 8/14/19 2:54 AM, Paul Emmerich wrote:
> > Starting point to debug/fix this would be to extract the osdmap from
> > one of the dead OSDs:
> >
> > ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
> >
> > Then try to run osdmaptool on that osdmap to see if it also crashes,
> > set some --debug options (don't know which one off the top of my
> > head).
> > Does it also crash? How does it differ from the map retrieved with
> > "ceph osd getmap"?
> >
> > You can also set the osdmap with "--op set-osdmap", does it help to
> > set the osdmap retrieved by "ceph osd getmap"?
> >
> > Paul
> >
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



-- 
Cheers,
Brad