RE: OSD crash when one of the PGs have problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



You are saying xfs is throwing errors in the dmesg, so, this could be a reason why ceph is not able to read PG metadata for leveldb->xfs, isn't it ?

Thanks & Regards
Somnath

-----Original Message-----
From: ceph-devel-owner@xxxxxxxxxxxxxxx [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Varada Kari
Sent: Wednesday, January 20, 2016 8:02 AM
To: Sage Weil
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: RE: OSD crash when one of the PGs have problem 

Yes Sage. Applied Somnath's write path changes on top the jewel branch (two weeks old) and jemalloc. Haven't tried with latest changes though. With the same code, we are not able hit this issue again, when retried.  Not sure if we are hitting any race in committing change here.

Varada

> -----Original Message-----
> From: Sage Weil [mailto:sage@xxxxxxxxxxxx]
> Sent: Wednesday, January 20, 2016 9:04 PM
> To: Varada Kari <Varada.Kari@xxxxxxxxxxx>
> Cc: ceph-devel@xxxxxxxxxxxxxxx
> Subject: Re: OSD crash when one of the PGs have problem
> 
> On Wed, 20 Jan 2016, Varada Kari wrote:
> > Hi all,
> >
> >
> > Sorry for the long mail. Wanted to give more information in the mail.
> >
> > We are seeing an issue in our test cluster when an osd is stopped 
> > gracefully
> and restarted after the maintenance.
> > Please note, haven't used any no out option for the cluster.
> > After the reboot, not able to one of the PG, and no drive problems 
> > are
> read errors are logged in dmesg/kernel logs. But we are not able to 
> read the head object of PG for some reason, sorry of lack of logs we 
> are running with debugs with 0/0 and build with jemalloc enabled for 
> testing. Hence a customized version.
> >
> > <snip>
> > 2015-12-27 13:14:39.684307 7f5ae47ff700 -1 osd.1 405 *** Got signal 
> > Terminated ***
> > 2015-12-27 13:14:39.779307 7f5ae47ff700  0 osd.1 405 prepare_to_stop 
> > telling mon we are shutting down
> > 2016-01-04 10:46:13.966959 7fad39d1e800  0 ceph version 
> > 10.0.0(7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5), process ceph-osd, 
> > pid 6975
> > 2016-01-04 10:46:14.007516 7fad39d1e800  0
> > filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342)
> > 2016-01-04 10:46:14.008565 7fad39d1e800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > FIEMAP ioctl is disabled via 'filestore fiemap' config option
> > 2016-01-04 10:46:14.008573 7fad39d1e800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' 
> > config option
> > 2016-01-04 10:46:14.008601 7fad39d1e800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > splice is supported
> > 2016-01-04 10:46:14.008882 7fad39d1e800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > syncfs(2) syscall fully supported (by glibc and kernel)
> > 2016-01-04 10:46:14.008921 7fad39d1e800  0
> > xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: 
> > extsize is supported and your kernel >= 3.5
> > 2016-01-04 10:46:14.339253 7fad39d1e800  0
> > filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD 
> > journal
> > mode: checkpoint is not enabled
> > 2016-01-04 10:46:14.532510 7fad39d1e800  0 <cls>
> > cls/hello/cls_hello.cc:305: loading cls_hello
> > 2016-01-04 10:46:14.534168 7fad39d1e800  0 <cls>
> > cls/cephfs/cls_cephfs.cc:136: loading cephfs_size_scan
> > 2016-01-04 10:46:14.535626 7fad39d1e800  0 osd.1 405 crush map has 
> > features 1107558400, adjusting msgr requires for clients
> > 2016-01-04 10:46:14.535634 7fad39d1e800  0 osd.1 405 crush map has 
> > features 1107558400 was 8705, adjusting msgr requires for mons
> > 2016-01-04 10:46:14.535639 7fad39d1e800  0 osd.1 405 crush map has 
> > features 1107558400, adjusting msgr requires for osds
> > 2016-01-04 10:46:14.954060 7fad39d1e800  0 osd.1 405 load_pgs
> > 2016-01-04 10:46:15.634892 7fad39d1e800 -1 osd.1 405 load_pgs unable 
> > to peek at 1.29e metadata, skipping
> 
> It looks like the pg's metadata didn't commit properly.  It doesn't 
> find it on startup, skips the pg, and then crashes below when it tries 
> to create it and the collection already exists.
> 
> Are there any changes applied on top of upstream filestore?  Also, is 
> this reproducible on a later version of the code?
> 
> > 2016-01-04 10:46:16.334253 7fad39d1e800  0 osd.1 405 load_pgs opened
> > 158 pgs
> > 2016-01-04 10:46:16.336960 7fad39d1e800 -1 osd.1 405 log_to_monitors 
> > {default=true}
> > 2016-01-04 10:46:16.702999 7fad273be700  0
> > filestore(/var/lib/ceph/osd/ceph-1)  error (17) File exists not 
> > handled on operation 0x7fad2c83f1c0 (201318686.0.0, or op 0, 
> > counting from 0)
> > 2016-01-04 10:46:16.703014 7fad273be700  0
> > filestore(/var/lib/ceph/osd/ceph-1) unexpected error code
> > 2016-01-04 10:46:16.703017 7fad273be700  0
> filestore(/var/lib/ceph/osd/ceph-1)  transaction dump:
> > {
> >     "ops": [
> >         {
> >             "op_num": 0,
> >             "op_name": "mkcoll",
> >             "collection": "1.29e_head"
> >         },
> >         {
> >             "op_num": 1,
> >             "op_name": "coll_hint",
> >             "collection": "1.29e_head",
> >             "type": 1,
> >             "pg_num": 800,
> >             "expected_num_objects": 0
> >         },
> >         {
> >             "op_num": 2,
> >             "op_name": "touch",
> >             "collection": "1.29e_head",
> >             "oid": "1\/0000029e\/\/head"
> >         },
> >         {
> >             "op_num": 3,
> >             "op_name": "omap_setkeys",
> >             "collection": "1.29e_head",
> >             "oid": "1\/0000029e\/\/head",
> >             "attr_lens": {
> >                 "_infover": 1
> >             }
> >         },
> >         {
> >             "op_num": 4,
> >             "op_name": "touch",
> >             "collection": "1.29e_head",
> >             "oid": "1\/0000029e\/\/head"
> >         },
> >         {
> >             "op_num": 5,
> >             "op_name": "omap_setkeys",
> >             "collection": "1.29e_head",
> >             "oid": "1\/0000029e\/\/head",
> >             "attr_lens": {
> >                 "_biginfo": 196,
> >                 "_epoch": 4,
> >                 "_info": 847,
> >                 "can_rollback_to": 12,
> >                 "rollback_info_trimmed_to": 12
> >             }
> >         }
> >     ]
> > }
> >
> > 2016-01-04 10:46:16.731323 7fad273be700 -1 os/FileStore.cc: In 
> > function 'unsigned int 
> > FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, 
> > ThreadPool::TPHandle*)' thread 7fad273be700 time 2016-01-04
> > 10:46:16.703103
> > os/FileStore.cc: 3017: FAILED assert(0 == "unexpected error")
> >
> >  ceph version 10.0.0 (7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5)
> >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > const*)+0x8b) [0x7fad39846a6b]
> >  2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned 
> > long, int, ThreadPool::TPHandle*)+0xa48) [0x7fad3968bf58]
> >  3: 
> > (FileStore::_do_transactions(std::list<ObjectStore::Transaction*,
> > std::allocator<ObjectStore::Transaction*> >&, unsigned long,
> > ThreadPool::TPHandle*)+0x64) [0x7fad396932a4]
> >  4: (FileStore::_do_op(FileStore::OpSequencer*,
> > ThreadPool::TPHandle&)+0x2c3) [0x7fad39693583]
> >  5: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56)
> > [0x7fad398384e6]
> >  6: (ThreadPool::WorkThread::entry()+0x10) [0x7fad398393b0]
> >  7: (()+0x8182) [0x7fad38609182]
> >  8: (clone()+0x6d) [0x7fad367fc47d]
> > <snip>
> >
> >  On the reboot, because of the journal reply, we didn't see any 
> > transaction
> dump issues this time, but we are seeing the following problem.
> >
> >  <snip>
> > 2016-01-04 10:47:09.302400 7f8190a3c800  0 ceph version 10.0.0 
> > (7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5), process ceph-osd, pid
> 9488
> > 2016-01-04 10:47:09.310055 7f8190a3c800  0
> > filestore(/var/lib/ceph/osd/ceph-1) backend xfs (magic 0x58465342)
> > 2016-01-04 10:47:09.310379 7f8190a3c800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > FIEMAP ioctl is disabled via 'filestore fiemap' config option
> > 2016-01-04 10:47:09.310384 7f8190a3c800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' 
> > config option
> > 2016-01-04 10:47:09.310398 7f8190a3c800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > splice is supported
> > 2016-01-04 10:47:09.310869 7f8190a3c800  0
> > genericfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features:
> > syncfs(2) syscall fully supported (by glibc and kernel)
> > 2016-01-04 10:47:09.310935 7f8190a3c800  0
> > xfsfilestorebackend(/var/lib/ceph/osd/ceph-1) detect_features: 
> > extsize is supported and your kernel >= 3.5
> > 2016-01-04 10:47:09.322795 7f8190a3c800  0
> > filestore(/var/lib/ceph/osd/ceph-1) mount: enabling WRITEAHEAD 
> > journal
> > mode: checkpoint is not enabled
> > 2016-01-04 10:47:09.322828 7f8190a3c800  0
> > filestore(/var/lib/ceph/osd/ceph-1) disabling wbthrottle( 
> > filestore_wbthrottle_enable = 0 ) since do_fast_sync is enabled
> > 2016-01-04 10:47:09.327861 7f8183ee0700  0
> > filestore(/var/lib/ceph/osd/ceph-1) ## Fast sync is enabled ##
> > 2016-01-04 10:47:09.392333 7f8190a3c800  0 <cls>
> > cls/hello/cls_hello.cc:305: loading cls_hello
> > 2016-01-04 10:47:09.393968 7f8190a3c800  0 <cls>
> > cls/cephfs/cls_cephfs.cc:136: loading cephfs_size_scan
> > 2016-01-04 10:47:09.394528 7f8190a3c800  0 osd.1 410 crush map has 
> > features 1107558400, adjusting msgr requires for clients
> > 2016-01-04 10:47:09.394536 7f8190a3c800  0 osd.1 410 crush map has 
> > features 1107558400 was 8705, adjusting msgr requires for mons
> > 2016-01-04 10:47:09.394552 7f8190a3c800  0 osd.1 410 crush map has 
> > features 1107558400, adjusting msgr requires for osds
> > 2016-01-04 10:47:09.458215 7f8190a3c800  0 osd.1 410 load_pgs
> > 2016-01-04 10:47:10.935841 7f8190a3c800  0 osd.1 410 load_pgs opened
> > 159 pgs
> > 2016-01-04 10:47:10.936696 7f8190a3c800 -1 osd.1 410 log_to_monitors 
> > {default=true}
> > 2016-01-04 10:47:13.942139 7f8190a3c800  0 osd.1 410 done with init, 
> > starting boot process
> > 2016-01-04 10:47:14.117463 7f817d1f3700  0 -- 
> > 10.242.43.141:6825/9488
> > >> 10.242.43.143:6821/40041 pipe(0x7f8141c0e000 sd=83 :6825 s=0 
> > >> pgs=0
> > cs=0 l=0 c=0x7f8141c14180).accept connect_seq 0 vs existing 0 state 
> > connecting
> > 2016-01-04 10:47:15.649791 7f812eefe700  0 -- 
> > 10.242.43.141:6825/9488
> > >> 10.242.43.141:6801/6106 pipe(0x7f8141c0e000 sd=93 :6825 s=0 pgs=0
> > cs=0 l=0 c=0x7f8141c14480).accept connect_seq 0 vs existing 0 state 
> > connecting
> > 2016-01-04 10:47:15.649826 7f812edfd700  0 -- 
> > 10.242.43.141:6825/9488
> > >> 10.242.43.141:6821/6925 pipe(0x7f8141c1c000 sd=137 :6825 s=0 
> > >> pgs=0
> > cs=0 l=0 c=0x7f8141c14780).accept connect_seq 0 vs existing 0 state 
> > connecting
> > 2016-01-04 10:47:15.924059 7f81527ed700 -1 osd/PGLog.h: In function 
> > 'void PGLog::IndexedLog::claim_log_and_clear_rollback_info(const
> > pg_log_t&)' thread 7f81527ed700 time 2016-01-04 10:47:15.906256
> > osd/PGLog.h: 89: FAILED assert(rollback_info_trimmed_to_riter ==
> > log.rbegin())
> >
> >  ceph version 10.0.0 (7eb76e73fd5b6e990ceba9d45e5a1fec92145fa5)
> >  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
> > const*)+0x8b) [0x7f8190564a6b]
> >  2: (PG::RecoveryState::Stray::react(PG::MLogRec const&)+0xa21) 
> > [0x7f8190173971]
> >  3: (boost::statechart::simple_state<PG::RecoveryState::Stray,
> > PG::RecoveryState::Started, boost::mpl::list<mpl_::na, mpl_::na, 
> > mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
> > mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
> > mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, 
> > (boost::statechart::history_mode)0>::react_impl(boost::statechart::e
> > ve nt_base const&, void const*)+0x1f4) [0x7f81901ad584]
> >  4:
> > (boost::statechart::state_machine<PG::RecoveryState::RecoveryMachine
> > , PG::RecoveryState::Initial, std::allocator<void>, 
> > boost::statechart::null_exception_translator>::send_event(boost::sta
> > te chart::event_base const&)+0x5b) [0x7f8190197bab]
> >  5: (PG::handle_peering_event(std::shared_ptr<PG::CephPeeringEvt>,
> > PG::RecoveryCtx*)+0x1ce) [0x7f81901412ce]
> >  6: (OSD::process_peering_events(std::list<PG*, std::allocator<PG*> 
> > > const&, ThreadPool::TPHandle&)+0x267) [0x7f81900cb677]
> >  7: (OSD::PeeringWQ::_process(std::list<PG*, std::allocator<PG*> > 
> > const&, ThreadPool::TPHandle&)+0x18) [0x7f819010fed8]
> >  8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa56)
> > [0x7f81905564e6]
> >  9: (ThreadPool::WorkThread::entry()+0x10) [0x7f81905573b0]
> >  10: (()+0x8182) [0x7f818f327182]
> >  11: (clone()+0x6d) [0x7f818d51a47d]
> >  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
> > needed
> to interpret this.
> > <snip>
> >
> >  Whenever we restart the osd, we are hitting the problem and not 
> > able to
> get the osd up. Tried pg repair, which bails saying primary osd down.
> >  Are there any repair techniques available to get this corrected?
> 
> Most likely deleting the pg directory will be sufficient to clear it up.
> 
> sage

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux