Re: OSDs keep crashing after cluster reboot

Another update:

We have now taken the more destructive route and removed the CephFS pools
(luckily we only had test data in the filesystem).
Our hope was that the OSD would delete the no-longer-needed PGs during
startup, but this is NOT the case.

So we still have the same issue; the only difference is that the PG
no longer belongs to a pool.

 -360> 2019-08-07 14:52:32.655 7fb14db8de00  5 osd.44 pg_epoch: 196586
pg[23.f8s0(unlocked)] enter Initial
 -360> 2019-08-07 14:52:32.659 7fb14db8de00 -1
/build/ceph-13.2.6/src/osd/ECUtil.h: In function
'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread
7fb14db8de00 time 2019-08-07 14:52:32.660169
/build/ceph-13.2.6/src/osd/ECUtil.h: 34: FAILED assert(stripe_width %
stripe_size == 0)
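
For readers following along, the failed assertion boils down to a
divisibility check. The sketch below is only an illustration: it assumes
(from my reading of the Mimic source, not confirmed in this thread) that
the constructor argument named stripe_size is the number of data chunks k
of the erasure-code profile. The function name and the sample values are
hypothetical and not taken from this cluster.

```python
def chunk_size(stripe_width: int, data_chunks: int) -> int:
    """Mimic the check at ECUtil.h:34.

    stripe_width must be an exact multiple of the number of data
    chunks (assumed to be what ECUtil calls stripe_size); otherwise
    the OSD asserts while building the EC backend for the PG.
    """
    if data_chunks <= 0:
        raise ValueError("data_chunks (k) must be positive")
    if stripe_width % data_chunks != 0:
        raise ValueError(
            f"stripe_width {stripe_width} is not a multiple of k={data_chunks}"
        )
    return stripe_width // data_chunks


if __name__ == "__main__":
    # Hypothetical (stripe_width, k) pairs, for illustration only.
    for width, k in [(16384, 4), (16384, 6)]:
        try:
            print(f"width={width} k={k} -> chunk_size={chunk_size(width, k)}")
        except ValueError as err:
            print(f"width={width} k={k} -> would assert: {err}")
```

Comparing the stripe_width shown by ceph osd dump with k from the
erasure-code profile in this way would show whether the stored pool
parameters are inconsistent.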

We can now go one of two ways: try to delete the PG by hand on the OSD
(BlueStore), but how can this be done? Or upgrade to Nautilus and
hope for the best.
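
For reference, manual PG removal is normally done with
ceph-objectstore-tool while the OSD is stopped. The sketch below only
builds the command lines and does not run them. The helper names and the
export file path are hypothetical, the data path assumes the default
/var/lib/ceph/osd/ceph-<id> layout, and none of this has been tested
against the failure described here.

```python
import shlex
from typing import List

TOOL = "ceph-objectstore-tool"


def _data_path(osd_id: int, data_root: str) -> str:
    # Assumption: default OSD data directory layout.
    return f"{data_root}/ceph-{osd_id}"


def pg_export_command(osd_id: int, pgid: str, out_file: str,
                      data_root: str = "/var/lib/ceph/osd") -> List[str]:
    """Build the command that exports a PG to a file as a backup."""
    return [TOOL, "--data-path", _data_path(osd_id, data_root),
            "--pgid", pgid, "--op", "export", "--file", out_file]


def pg_remove_command(osd_id: int, pgid: str,
                      data_root: str = "/var/lib/ceph/osd") -> List[str]:
    """Build the command that removes a PG from a stopped OSD."""
    return [TOOL, "--data-path", _data_path(osd_id, data_root),
            "--pgid", pgid, "--op", "remove", "--force"]


if __name__ == "__main__":
    # osd.44 and PG 23.f8s0 come from the log above; the backup file
    # name is a made-up example.
    print(shlex.join(pg_export_command(44, "23.f8s0", "/root/23.f8s0.export")))
    print(shlex.join(pg_remove_command(44, "23.f8s0")))
```

Exporting first keeps a copy of the PG in case the removal turns out to
be the wrong move.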

Any help or hints are welcome.
Have a nice one,
Ansgar

On Wed, 7 Aug 2019 at 11:32, Ansgar Jazdzewski
<a.jazdzewski@xxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> as a follow-up:
> * a full log of one OSD failing to start: https://pastebin.com/T8UQ2rZ6
> * how we created our EC pool in the first place: https://pastebin.com/20cC06Jn
> * output of ceph osd dump and ceph osd erasure-code-profile get cephfs:
> https://pastebin.com/TRLPaWcH
>
> As we dig further into it, it looks like a bug in the CephFS or
> erasure-coding part of Ceph.
>
> Ansgar
>
>
> On Tue, 6 Aug 2019 at 14:50, Ansgar Jazdzewski
> <a.jazdzewski@xxxxxxxxxxxxxx> wrote:
> >
> > Hi folks,
> >
> > we had to move one of our clusters, so we had to reboot all servers. Now
> > we see an error on all OSDs that hold the EC pool.
> >
> > Are we missing some options? Will an upgrade to 13.2.6 help?
> >
> >
> > Thanks,
> > Ansgar
> >
> > 2019-08-06 12:10:16.265 7fb337b83200 -1
> > /build/ceph-13.2.4/src/osd/ECUtil.h: In function
> > 'ECUtil::stripe_info_t::stripe_info_t(uint64_t, uint64_t)' thread
> > 7fb337b83200 time 2019-08-06 12:10:16.263025
> > /build/ceph-13.2.4/src/osd/ECUtil.h: 34: FAILED assert(stripe_width %
> > stripe_size == 0)
> >
> > ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic
> > (stable)
> > 1: (ceph::ceph_assert_fail(char const, char const, int, char
> >    const)+0x102) [0x7fb32eeb83c2]
> > 2: (()+0x2e5587) [0x7fb32eeb8587]
> > 3: (ECBackend::ECBackend(PGBackend::Listener, coll_t const&,
> >    boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore,
> >    CephContext, std::shared_ptr<ceph::ErasureCodeInterface>, unsigned
> >    long)+0x4de) [0xa4cbbe]
> > 4: (PGBackend::build_pg_backend(pg_pool_t const&,
> >    std::map<std::cxx11::basic_string<char, std::char_traits<char>,
> >    std::allocator<char> >, std::cxx11::basic_string<char,
> >    std::char_traits<char>, std::allocator<char> >,
> >    std::less<std::cxx11::basic_string<char, std::char_traits<char>,
> >    std::allocator<char> > >,
> >    std::allocator<std::pair<std::__cxx11::basic_string<char,
> >    std::char_traits<char>, std::allocator<char> > const,
> >    std::cxx11::basic_string<char, std::char_traits<char>,
> >    std::allocator<char> > > > > const&, PGBackend::Listener, coll_t,
> >    boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ObjectStore,
> >    CephContext)+0x2f9) [0x9474e9]
> > 5: (PrimaryLogPG::PrimaryLogPG(OSDService, std::shared_ptr<OSDMap
> >    const>, PGPool const&, std::map<std::cxx11::basic_string<char,
> >    std::char_traits<char>, std::allocator<char> >,
> >    std::cxx11::basic_string<char, std::char_traits<char>,
> >    std::allocator<char> >, std::less<std::cxx11::basic_string<char,
> >    std::char_traits<char>, std::allocator<char> > >,
> >    std::allocator<std::pair<std::__cxx11::basic_string<char,
> >    std::char_traits<char>, std::allocator<char> > const,
> >    std::cxx11::basic_string<char, std::char_traits<char>,
> >    std::allocator<char> > > > > const&, spg_t)+0x138) [0x8f96e8]
> > 6: (OSD::_make_pg(std::shared_ptr<OSDMap const>, spg_t)+0x11d3)
> >    [0x753553]
> > 7: (OSD::load_pgs()+0x4a9) [0x758339]
> > 8: (OSD::init()+0xcd3) [0x7619c3]
> > 9: (main()+0x3678) [0x64d6a8]
> > 10: (libc_start_main()+0xf0) [0x7fb32ca68830]
> > 11: (_start()+0x29) [0x717389]
> > NOTE: a copy of the executable, or objdump -rdS <executable> is
> > needed to interpret this.