Re: OSD assert fail

On Mon, 6 Sep 2010, Sage Weil wrote:
> Hi,
> 
> This is one we've seen before, issue #326
> 
> 	http://tracker.newdream.net/issues/326
> 
> Was that the first (and only?) osd to fail?

I mean, those two osds... were they the first failures you saw?

Thanks!


> 
> What kind of workload were you subjecting the cluster to?  Just the file 
> system?  RBD?  Anything unusual?
> 
> Also, can you confirm what version of the code you were running?  The osd 
> log at /var/log/ceph/osd.*.log should have a version number and sha1 id, 
> something like
> 
> ceph version 0.22~rc (3cd9d853cd58c79dc12427be8488e57970abda04)
> 
> Thanks!
> sage
> 
> 
> On Mon, 6 Sep 2010, Leander Yu wrote:
> 
> > Hi all,
> > I have set up a 10 osd + 2 mds + 3 mon ceph cluster. It ran fine at
> > first. However, after one day, some of the osds crashed with the
> > following assert failures.
> > I am using the unstable trunk. ceph.conf is attached.
> > 
> > -------------- osd 3 -----------------
> > osd/PG.h: In function 'void PG::IndexedLog::index(PG::Log::Entry&)':
> > osd/PG.h:429: FAILED assert(caller_ops.count(e.reqid) == 0)
> >  1: (OSD::_process_pg_info(unsigned int, int, PG::Info&, PG::Log&,
> > PG::Missing&, std::map<int, MOSDPGInfo*, std::less<int>,
> > std::allocator<std::pair<int const, MOSDPGInfo*> > >*, int&)+0xb06)
> > [0x4cf426]
> >  2: (OSD::handle_pg_log(MOSDPGLog*)+0xa9) [0x4cf999]
> >  3: (OSD::_dispatch(Message*)+0x3ed) [0x4e7dfd]
> >  4: (OSD::ms_dispatch(Message*)+0x39) [0x4e86c9]
> >  5: (SimpleMessenger::dispatch_entry()+0x789) [0x46b5f9]
> >  6: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x45849c]
> >  7: (Thread::_entry_func(void*)+0xa) [0x46c0ca]
> >  8: (()+0x6a3a) [0x7f69fd39ea3a]
> >  9: (clone()+0x6d) [0x7f69fc5bc77d]
> > 
> > -------------- osd 7 --------------------
> > osd/ReplicatedPG.cc: In function 'void ReplicatedPG::sub_op_pull(MOSDSubOp*)':
> > osd/ReplicatedPG.cc:3021: FAILED assert(r == 0)
> >  1: (OSD::dequeue_op(PG*)+0x344) [0x4e6fd4]
> >  2: (ThreadPool::worker()+0x28f) [0x5b5a9f]
> >  3: (ThreadPool::WorkThread::entry()+0xd) [0x4f0acd]
> >  4: (Thread::_entry_func(void*)+0xa) [0x46c0ca]
> >  5: (()+0x6a3a) [0x7efff4f12a3a]
> >  6: (clone()+0x6d) [0x7efff413077d]
> > 
> > Please let me know if you need more information. I am keeping the
> > environment intact so I can collect more data for debugging.
> > 
> > Thanks.
> > 
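
On the version question quoted above: if it helps, here is a minimal
sketch for pulling the version number and sha1 id out of an osd log. The
regular expression is modeled only on the sample line quoted in this
thread, so treat the exact format as an assumption; other builds may
print the line differently. The function name find_ceph_version is
hypothetical, not part of any ceph tool.

```python
import re

# Pattern modeled on the sample line quoted in this thread, e.g.
#   ceph version 0.22~rc (3cd9d853cd58c79dc12427be8488e57970abda04)
# Assumption: the version token has no spaces and the id is a 40-char hex sha1.
VERSION_RE = re.compile(r"ceph version (\S+) \(([0-9a-f]{40})\)")


def find_ceph_version(lines):
    """Return (version, sha1) from the first matching line, or None."""
    for line in lines:
        match = VERSION_RE.search(line)
        if match:
            return match.group(1), match.group(2)
    return None


# Demonstration using the sample line from this thread.
sample = ["ceph version 0.22~rc (3cd9d853cd58c79dc12427be8488e57970abda04)"]
print(find_ceph_version(sample))
```

To scan a real log, pass an open file object instead of the sample list,
for example one of the files matching /var/log/ceph/osd.*.log. The
function stops at the first match, so it does not read the whole log.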
