Some OSD and MDS crash


Ok, in current/meta on osd 20 and osd 23, please attach all files matching

^osdmap.13258.*

There should be one such file on each osd. (should look something like
osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
you'll want to use find).
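
A minimal sketch of that search, for illustration only. The helper name
find_osdmap is hypothetical, the data directory /var/lib/ceph/osd/ceph-N is
assumed to be the default location (adjust it if your OSDs are mounted
elsewhere), and the "__" after the epoch is inferred from the example file
name above:

```shell
#!/bin/sh
# find_osdmap OSD_DATA_DIR EPOCH   (hypothetical helper, not part of ceph)
# Recursively search current/meta for the osdmap object of one epoch.
# The files may be hashed into subdirectories, hence find rather than ls.
find_osdmap() {
    osd_dir=$1
    epoch=$2
    # The "__" suffix keeps epoch 13258 from also matching e.g. 132580.
    find "$osd_dir/current/meta" -type f -name "osdmap.${epoch}__*"
}

# Example invocations (paths are assumptions, see above):
#   find_osdmap /var/lib/ceph/osd/ceph-20 13258
#   find_osdmap /var/lib/ceph/osd/ceph-23 13258
```

Each call should print one path per OSD; those are the files to attach.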

What version of ceph is running on your mons?  How many mons do you have?
-Sam

On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
<pierre.blondeau at unicaen.fr> wrote:
> Hi,
>
> Done. The log files are available here:
> https://blondeau.users.greyc.fr/cephlog/debug20/
>
> The OSD log files are really big, around 80M each.
>
> After I started osd.20, some other OSDs crashed. I went from 31 OSDs up to
> 16 up. I noticed that after this the number of down+peering PGs decreased
> from 367 to 248. Is that "normal"? Maybe it is temporary, while the cluster
> verifies all the PGs?
>
> Regards
> Pierre
>
> On 02/07/2014 19:16, Samuel Just wrote:
>
>> You should add
>>
>> debug osd = 20
>> debug filestore = 20
>> debug ms = 1
>>
>> to the [osd] section of the ceph.conf and restart the osds.  I'd like
>> all three logs if possible.
>>
>> Thanks
>> -Sam
>>
>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>> <pierre.blondeau at unicaen.fr> wrote:
>>>
>>> Yes, but how do I do that?
>>>
>>> With a command like this?
>>>
>>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
>>>
>>> Or by modifying /etc/ceph/ceph.conf? That file is nearly empty because
>>> I use udev detection.
>>>
>>> Once I have made these changes, do you want all three log files or only
>>> osd.20's?
>>>
>>> Thank you so much for the help
>>>
>>> Regards
>>> Pierre
>>>
>>> On 01/07/2014 23:51, Samuel Just wrote:
>>>
>>>> Can you reproduce with
>>>> debug osd = 20
>>>> debug filestore = 20
>>>> debug ms = 1
>>>> ?
>>>> -Sam
>>>>
>>>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am attaching logs for:
>>>>>    - osd.20, one of the OSDs that I found makes other OSDs crash.
>>>>>    - osd.23, one of the OSDs that crash when I start osd.20.
>>>>>    - mds, one of my MDSs.
>>>>>
>>>>> I truncated the log files because they are too big. Everything is here:
>>>>> https://blondeau.users.greyc.fr/cephlog/
>>>>>
>>>>> Regards
>>>>>
>>>>> On 30/06/2014 17:35, Gregory Farnum wrote:
>>>>>
>>>>>> What's the backtrace from the crashing OSDs?
>>>>>>
>>>>>> Keep in mind that as a dev release, it's generally best not to upgrade
>>>>>> to unnamed versions like 0.82 (but it's probably too late to go back
>>>>>> now).
>>>>>
>>>>>
>>>>> I will remember that next time ;)
>>>>>
>>>>>> -Greg
>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>>
>>>>>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
>>>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> After the upgrade to firefly, I have some PGs in peering state.
>>>>>>> I saw that 0.82 had been released, so I tried upgrading to solve
>>>>>>> my problem.
>>>>>>>
>>>>>>> My three MDSs crash, and some OSDs trigger a chain reaction that
>>>>>>> kills other OSDs.
>>>>>>> I think my MDSs will not start because the metadata are on the OSDs.
>>>>>>>
>>>>>>> I have 36 OSDs on three servers, and I identified 5 OSDs that make
>>>>>>> the others crash. If I do not start them, the cluster goes into a
>>>>>>> recovery state with 31 OSDs, but I have 378 PGs in down+peering
>>>>>>> state.
>>>>>>>
>>>>>>> What can I do? Would you like more information (OS, crash log,
>>>>>>> etc.)?
>>>>>>>
>>>>>>> Regards
>
>
> --
> ----------------------------------------------
> Pierre BLONDEAU
> Administrateur Systèmes & réseaux
> Université de Caen
> Laboratoire GREYC, Département d'informatique
>
> tel     : 02 31 56 75 42
> bureau  : Campus 2, Science 3, 406
> ----------------------------------------------
>

