Re: Problems starting up OSD

On Sat, Nov 22, 2014 at 11:39 AM, Jeffrey Ollie <jeff@xxxxxxxxxx> wrote:
> On Sat, Nov 22, 2014 at 1:22 PM, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>> Can you post the OSD log somewhere? It should have a few more details
>> about what's going on here. (This backtrace looks like it's crashing
>> in a call to pthreads, which is a little unusual.)
>
> Uploaded to Google Drive:
>
> https://drive.google.com/file/d/0B5VwdTUBhU7UNXFlR1FRRHRVNm8/view?usp=sharing
>
>>
>> On Sat, Nov 22, 2014 at 1:01 PM, Jeffrey Ollie <jeff@xxxxxxxxxx> wrote:
>>> One of my OSDs lost network connectivity for a short while.  The OSD
>>> crashed and now when I try and start it back up the process is killed
>>> because of an illegal instruction.  Is there anything that I can do to
>>> get this going again or am I going to need to rebuild it from scratch
>>> (which wouldn't be a completely terrible idea as I set this up with
>>> the journal on the same drive).  This particular OSD is running on
>>> Fedora 21 Beta.
>>>
>>> Nov 22 12:23:26 home01.ocjtech.us ceph-osd[22977]: 0> 2014-11-22
>>> 12:23:26.908700 7fdab90ae7c0 -1 *** Caught signal (Illegal
>>> instruction) **
>>> Nov 22 12:23:26 home01.ocjtech.us ceph-osd[22977]: in thread 7fdab90ae7c0
>>> Nov 22 12:23:26 home01.ocjtech.us ceph-osd[22977]: ceph version 0.87
>>> (c51c8f9d80fa4e0168aa52685b8de40e42758578)
>>> Nov 22 12:23:26 home01.ocjtech.us ceph-osd[22977]: 1:
>>> /usr/bin/ceph-osd() [0x9edd55]
>>> Nov 22 12:23:26 home01.ocjtech.us ceph-osd[22977]: 2: (()+0x100d0)
>>> [0x7fdab80740d0]
>>> Nov 22 12:23:26 home01.ocjtech.us ceph-osd[22977]: 3:
>>> (pthread_rwlock_unlock()+0x13) [0x7fdab8070153]
>>> Nov 22 12:23:26 home01.ocjtech.us ceph-osd[22977]: 4:
>>> (IndexManager::init_index(coll_t, char const*, unsigned int)+0x513)
>>> [0x8da3b3]

Looks to me like this is the result of us being naughty with rwlock handling:
http://tracker.ceph.com/issues/10085
https://github.com/ceph/ceph/pull/2937

It should be fixed soon, and was probably triggered by the disk
snapshot state being not quite what the OSD expected. If you're able
to build your own packages you can apply the linked patch and it
should start up again for you.
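
For illustration only (this is not the linked patch, and it is not
Ceph's own lock wrapper): the backtrace above dies inside
pthread_rwlock_unlock() called from IndexManager::init_index, and POSIX
leaves the result undefined when a thread unlocks an rwlock it does not
hold. Assuming that is the class of mistake involved, one common way to
rule it out is a guard that remembers whether it took the lock. The
names ScopedRWLock and init_with_early_exit are hypothetical.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical RAII guard, not Ceph's RWLock class. It records whether it
// holds the lock, so pthread_rwlock_unlock() is only called to release a
// lock that this guard actually acquired.
class ScopedRWLock {
public:
  explicit ScopedRWLock(pthread_rwlock_t &lock) : lock_(lock), held_(false) {}
  ~ScopedRWLock() { unlock(); }

  ScopedRWLock(const ScopedRWLock &) = delete;
  ScopedRWLock &operator=(const ScopedRWLock &) = delete;

  // Take the lock for reading; returns true on success.
  bool lock_read() {
    assert(!held_);
    held_ = (pthread_rwlock_rdlock(&lock_) == 0);
    return held_;
  }

  // Take the lock for writing; returns true on success.
  bool lock_write() {
    assert(!held_);
    held_ = (pthread_rwlock_wrlock(&lock_) == 0);
    return held_;
  }

  // Release the lock only if this guard holds it; otherwise do nothing.
  void unlock() {
    if (held_) {
      int rc = pthread_rwlock_unlock(&lock_);
      assert(rc == 0);
      (void)rc;  // silence unused-variable warnings when NDEBUG is set
      held_ = false;
    }
  }

  bool held() const { return held_; }

private:
  pthread_rwlock_t &lock_;
  bool held_;
};

// Hypothetical stand-in for an init routine with an early-return path.
bool init_with_early_exit(pthread_rwlock_t &lock, bool fail_before_locking) {
  ScopedRWLock guard(lock);
  if (fail_before_locking)
    return false;  // never locked, so the destructor does not unlock
  if (!guard.lock_write())
    return false;
  // ... work that needs the write lock would go here ...
  return true;     // destructor releases the write lock
}
```

Whichever path the function leaves by, the unlock call is paired with a
successful lock call, which is the property the sketch is meant to show.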
-Greg
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



