This is the same OSD, and it hasn't been working in the meantime? Can
your cluster operate without that OSD?
-Sam

On Mon, Aug 19, 2013 at 2:05 PM, Olivier Bonvalet <ceph.list@xxxxxxxxx> wrote:
> On Monday, 19 August 2013 at 12:27 +0200, Olivier Bonvalet wrote:
>> Hi,
>>
>> I have an OSD which crashes every time I try to start it (see logs below).
>> Is it a known problem? And is there a way to fix it?
>>
>> root! taman:/var/log/ceph# grep -v ' pipe' osd.65.log
>> 2013-08-19 11:07:48.478558 7f6fe367a780 0 ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff), process ceph-osd, pid 19327
>> 2013-08-19 11:07:48.516363 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount FIEMAP ioctl is supported and appears to work
>> 2013-08-19 11:07:48.516380 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
>> 2013-08-19 11:07:48.516514 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount did NOT detect btrfs
>> 2013-08-19 11:07:48.517087 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount syscall(SYS_syncfs, fd) fully supported
>> 2013-08-19 11:07:48.517389 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount found snaps <>
>> 2013-08-19 11:07:49.199483 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount: enabling WRITEAHEAD journal mode: btrfs not detected
>> 2013-08-19 11:07:52.191336 7f6fe367a780 1 journal _open /dev/sdk4 fd 18: 53687091200 bytes, block size 4096 bytes, directio = 1, aio = 1
>> 2013-08-19 11:07:52.196020 7f6fe367a780 1 journal _open /dev/sdk4 fd 18: 53687091200 bytes, block size 4096 bytes, directio = 1, aio = 1
>> 2013-08-19 11:07:52.196920 7f6fe367a780 1 journal close /dev/sdk4
>> 2013-08-19 11:07:52.199908 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount FIEMAP ioctl is supported and appears to work
>> 2013-08-19 11:07:52.199916 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount FIEMAP ioctl is disabled via 'filestore fiemap' config option
>> 2013-08-19 11:07:52.200058 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount did NOT detect btrfs
>> 2013-08-19 11:07:52.200886 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount syscall(SYS_syncfs, fd) fully supported
>> 2013-08-19 11:07:52.200919 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount found snaps <>
>> 2013-08-19 11:07:52.215850 7f6fe367a780 0 filestore(/var/lib/ceph/osd/ceph-65) mount: enabling WRITEAHEAD journal mode: btrfs not detected
>> 2013-08-19 11:07:52.219819 7f6fe367a780 1 journal _open /dev/sdk4 fd 26: 53687091200 bytes, block size 4096 bytes, directio = 1, aio = 1
>> 2013-08-19 11:07:52.227420 7f6fe367a780 1 journal _open /dev/sdk4 fd 26: 53687091200 bytes, block size 4096 bytes, directio = 1, aio = 1
>> 2013-08-19 11:07:52.500342 7f6fe367a780 0 osd.65 144201 crush map has features 262144, adjusting msgr requires for clients
>> 2013-08-19 11:07:52.500353 7f6fe367a780 0 osd.65 144201 crush map has features 262144, adjusting msgr requires for osds
>> 2013-08-19 11:08:13.581709 7f6fbdcb5700 -1 osd/OSD.cc: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7f6fbdcb5700 time 2013-08-19 11:08:13.579519
>> osd/OSD.cc: 4844: FAILED assert(_get_map_bl(epoch, bl))
>>
>>  ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff)
>>  1: (OSDService::get_map(unsigned int)+0x44b) [0x6f5b9b]
>>  2: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >*)+0x3c8) [0x6f8f48]
>>  3: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x31f) [0x6f975f]
>>  4: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x14) [0x7391d4]
>>  5: (ThreadPool::worker(ThreadPool::WorkThread*)+0x68a) [0x8f8e3a]
>>  6: (ThreadPool::WorkThread::entry()+0x10) [0x8fa0e0]
>>  7: (()+0x6b50) [0x7f6fe3070b50]
>>  8: (clone()+0x6d) [0x7f6fe15cba7d]
>>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>>
>> Full logs here: http://pastebin.com/RphNyLU0
>>
>
> Hi,
>
> Still the same problem with Ceph 0.61.8:
>
> 2013-08-19 23:01:54.369609 7fdd667a4780 0 osd.65 144279 crush map has features 262144, adjusting msgr requires for osds
> 2013-08-19 23:01:58.315115 7fdd405de700 -1 osd/OSD.cc: In function 'OSDMapRef OSDService::get_map(epoch_t)' thread 7fdd405de700 time 2013-08-19 23:01:58.313955
> osd/OSD.cc: 4847: FAILED assert(_get_map_bl(epoch, bl))
>
>  ceph version 0.61.8 (a6fdcca3bddbc9f177e4e2bf0d9cdd85006b028b)
>  1: (OSDService::get_map(unsigned int)+0x44b) [0x6f736b]
>  2: (OSD::advance_pg(unsigned int, PG*, ThreadPool::TPHandle&, PG::RecoveryCtx*, std::set<boost::intrusive_ptr<PG>, std::less<boost::intrusive_ptr<PG> >, std::allocator<boost::intrusive_ptr<PG> > >*)+0x3c8) [0x6fa708]
>  3: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x31f) [0x6faf1f]
>  4: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > const&, ThreadPool::TPHandle&)+0x14) [0x73a9b4]
>  5: (ThreadPool::worker(ThreadPool::WorkThread*)+0x68a) [0x8fb69a]
>  6: (ThreadPool::WorkThread::entry()+0x10) [0x8fc940]
>  7: (()+0x6b50) [0x7fdd6619ab50]
>  8: (clone()+0x6d) [0x7fdd646f5a7d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> (It's on Debian Wheezy, with a 3.10.5 kernel)
>
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com