Re: OSD failure on start

Actually, that bug did not exist in 0.48.1, so it must have been something
different.  Was that the node you had the trouble with the pg logs on?
-Sam

On Wed, Feb 13, 2013 at 2:47 PM, Mandell Degerness
<mandell@xxxxxxxxxxxxxxx> wrote:
> Thanks.  I'm glad to hear it is fixed in the new version.  Wiping the OSD worked.
>
> On Wed, Feb 13, 2013 at 2:08 PM, Mike Dawson
> <mike.dawson@xxxxxxxxxxxxxxxx> wrote:
>> Mandell,
>>
>> A few of us saw a similar failure on 0.56.1.
>>
>> http://tracker.ceph.com/issues/3770
>>
>> Sam Just patched the issue for 0.56.2. My understanding is Sam's patch
>> prevents the issue in the future, but doesn't repair a previously damaged
>> OSD.
>>
>> If you have good replication (or a good backup), I have had luck removing
>> the affected OSD, formatting, and re-adding it. I believe Sam may have a
>> manual process to fix it if you can't wipe this OSD.
>>
>> Good Luck,
>> Mike
>>
>>
>>
>> On 2/13/2013 2:57 PM, Mandell Degerness wrote:
>>>
>>> I'm getting this error on one of my OSDs when I try to start it.
>>>
>>> I can gather more complete log data if no one recognizes the error from
>>> this:
>>>
>>> Feb 13 19:30:04 node-192-168-8-14 ceph-osd: 2013-02-13 19:30:04.612847
>>> 7f4f607e7780  0 filestore(/mnt/osd96) mount found snaps <>
>>> Feb 13 19:30:04 node-192-168-8-14 ceph-osd: 2013-02-13 19:30:04.615147
>>> 7f4f607e7780  0 filestore(/mnt/osd96) mount: enabling WRITEAHEAD
>>> journal mode: btrfs not detected
>>> Feb 13 19:30:04 node-192-168-8-14 ceph-osd: 2013-02-13 19:30:04.658965
>>> 7f4f607e7780  1 journal _open /mnt/osd96/journal fd 30: 8589934592
>>> bytes, block size 4096 bytes, directio = 1, aio = 0
>>> Feb 13 19:30:04 node-192-168-8-14 ceph-osd: 2013-02-13 19:30:04.720091
>>> 7f4f607e7780  1 journal _open /mnt/osd96/journal fd 30: 8589934592
>>> bytes, block size 4096 bytes, directio = 1, aio = 0
>>> Feb 13 19:30:04 node-192-168-8-14 ceph-osd: 2013-02-13 19:30:04.721871
>>> 7f4f607e7780 -1 osd/OSD.cc: In function 'OSDMapRef
>>> OSD::get_map(epoch_t)' thread 7f4f607e7780 time 2013-02-13
>>> 19:30:04.721278
>>> osd/OSD.cc: 4029: FAILED assert(_get_map_bl(epoch, bl))
>>>
>>>   ceph version 0.48.1argonaut
>>> (commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
>>>   1: (OSD::get_map(unsigned int)+0x560) [0x7f4f60a411e0]
>>>   2: (OSD::init()+0x5a3) [0x7f4f60a53ce3]
>>>   3: (main()+0x4462) [0x7f4f6096d182]
>>>   4: (__libc_start_main()+0xfd) [0x7f4f5e64b26d]
>>>   5: (()+0x16e829) [0x7f4f60968829]
>>>   NOTE: a copy of the executable, or `objdump -rdS <executable>` is
>>> needed to interpret this.
>>> Feb 13 19:30:04 node-192-168-8-14 ceph-osd: --- begin dump of recent
>>> events ---
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>