Some OSD and MDS crash

On 03/07/2014 13:49, Joao Eduardo Luis wrote:
> On 07/03/2014 12:15 AM, Pierre BLONDEAU wrote:
>> On 03/07/2014 00:55, Samuel Just wrote:
>>> Ah,
>>>
>>> ~/logs $ for i in 20 23; do ../ceph/src/osdmaptool --export-crush
>>> /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i >
>>> /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
>>> ../ceph/src/osdmaptool: osdmap file
>>> 'osd-20_osdmap.13258__0_4E62BB79__none'
>>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
>>> ../ceph/src/osdmaptool: osdmap file
>>> 'osd-23_osdmap.13258__0_4E62BB79__none'
>>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
>>> 6d5
>>> < tunable chooseleaf_vary_r 1
>>>
>>>  Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
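>>>
>>> For reference, a quick way to see which tunables the cluster currently
>>> advertises (a minimal sketch, assuming a reachable monitor) is:
>>>
>>>   ceph osd crush show-tunables
>>>
>>> which includes chooseleaf_vary_r, so it can be compared against what the
>>> OSDs decoded from their on-disk osdmaps.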
>
> The only thing that comes to mind that could cause this is if we changed
> the leader's in-memory map, proposed it, it failed, and only the leader
> got to write the map to disk somehow.  This happened once on a totally
> different issue (although I can't pinpoint right now which).
>
> In such a scenario, the leader would serve the incorrect osdmap to
> whoever asked osdmaps from it, the remaining quorum would serve the
> correct osdmaps to all the others.  This could cause this divergence. Or
> it could be something else.
>
> Are there logs for the monitors for the timeframe this may have happened
> in?

Which exact timeframe do you want? I have 7 days of logs, so I should have 
information about the upgrade from firefly to 0.82.
Which mon's logs do you want? All three?

Regards

>    -Joao
>
>>>
>>> Pierre: do you recall how and when that got set?
>>
>> I am not sure I understand, but if I remember correctly, after the upgrade
>> to firefly the cluster was in the state: HEALTH_WARN crush map has legacy
>> tunables, and I saw "feature set mismatch" in the logs.
>>
>> So, if I remember correctly, I ran ceph osd crush tunables optimal to fix
>> the "crush map" warning, and I upgraded my client and server kernels to
>> 3.16rc.
>>
>> Could that be it?
>>
>> Pierre
>>
>>> -Sam
>>>
>>> On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just <sam.just at inktank.com>
>>> wrote:
>>>> Yeah, divergent osdmaps:
>>>> 555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0_4E62BB79__none
>>>> 6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0_4E62BB79__none
>>>>
>>>> Joao: thoughts?
>>>> -Sam
>>>>
>>>> On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>> The files are attached.
>>>>>
>>>>> When I upgraded:
>>>>>   ceph-deploy install --stable firefly servers...
>>>>>   on each server: service ceph restart mon
>>>>>   on each server: service ceph restart osd
>>>>>   on each server: service ceph restart mds
>>>>>
>>>>> I upgraded from emperor to firefly. After repair, remap, replace,
>>>>> etc., I have some PGs which remain stuck in the peering state.
>>>>>
>>>>> I thought, why not try version 0.82, it could solve my problem (that
>>>>> was my mistake). So, I upgraded from firefly to 0.82 with:
>>>>>   ceph-deploy install --testing servers...
>>>>>   ..
>>>>>
>>>>> Now, all programs are at version 0.82.
>>>>> I have 3 mons, 36 OSDs and 3 MDSs.
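>>>>>
>>>>> (One way to confirm the running versions after such an upgrade, as a
>>>>> rough sketch assuming the monitors are reachable:
>>>>>   ceph --version            # package version on the local node
>>>>>   ceph tell osd.* version   # version reported by each running OSD
>>>>> )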
>>>>>
>>>>> Pierre
>>>>>
>>>>> PS: I also find "inc\uosdmap.13258__0_469271DE__none" in each meta
>>>>> directory.
>>>>>
>>>>> On 03/07/2014 00:10, Samuel Just wrote:
>>>>>
>>>>>> Also, what version did you upgrade from, and how did you upgrade?
>>>>>> -Sam
>>>>>>
>>>>>> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just <sam.just at inktank.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> Ok, in current/meta on osd 20 and osd 23, please attach all files
>>>>>>> matching
>>>>>>>
>>>>>>> ^osdmap.13258.*
>>>>>>>
>>>>>>> There should be one such file on each osd. (should look something
>>>>>>> like
>>>>>>> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
>>>>>>> you'll want to use find).
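>>>>>>>
>>>>>>> A sketch of how to locate them, assuming the default OSD data path
>>>>>>> (/var/lib/ceph/osd/ceph-<id>):
>>>>>>>
>>>>>>>   find /var/lib/ceph/osd/ceph-20/current/meta -name 'osdmap.13258*'
>>>>>>>   find /var/lib/ceph/osd/ceph-23/current/meta -name 'osdmap.13258*'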
>>>>>>>
>>>>>>> What version of ceph is running on your mons?  How many mons do
>>>>>>> you have?
>>>>>>> -Sam
>>>>>>>
>>>>>>> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
>>>>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I did it; the log files are available here:
>>>>>>>> https://blondeau.users.greyc.fr/cephlog/debug20/
>>>>>>>>
>>>>>>>> The OSD log files are really big, +/- 80 MB.
>>>>>>>>
>>>>>>>> After starting osd.20, some other OSDs crashed. The number of OSDs
>>>>>>>> up went from 31 to 16.
>>>>>>>> I noticed that after this the number of down+peering PGs decreased
>>>>>>>> from 367 to 248. Is that "normal"? Maybe it's temporary, just the
>>>>>>>> time the cluster needs to verify all the PGs?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Pierre
>>>>>>>>
>>>>>>>> On 02/07/2014 19:16, Samuel Just wrote:
>>>>>>>>
>>>>>>>>> You should add
>>>>>>>>>
>>>>>>>>> debug osd = 20
>>>>>>>>> debug filestore = 20
>>>>>>>>> debug ms = 1
>>>>>>>>>
>>>>>>>>> to the [osd] section of the ceph.conf and restart the osds.  I'd
>>>>>>>>> like
>>>>>>>>> all three logs if possible.
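>>>>>>>>>
>>>>>>>>> As a sketch, the resulting ceph.conf fragment would look like:
>>>>>>>>>
>>>>>>>>>   [osd]
>>>>>>>>>       debug osd = 20
>>>>>>>>>       debug filestore = 20
>>>>>>>>>       debug ms = 1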
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> -Sam
>>>>>>>>>
>>>>>>>>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>>>>>>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yes, but how do I do that?
>>>>>>>>>>
>>>>>>>>>> With a command like this?
>>>>>>>>>>
>>>>>>>>>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
>>>>>>>>>>
>>>>>>>>>> Or by modifying /etc/ceph/ceph.conf? That file is really minimal
>>>>>>>>>> because I use udev detection.
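>>>>>>>>>>
>>>>>>>>>> (As an aside, the same arguments can be injected into all OSDs at
>>>>>>>>>> once; a sketch, assuming the monitors are reachable:
>>>>>>>>>>   ceph tell osd.* injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
>>>>>>>>>> Note that injectargs only affects the running daemons and is lost
>>>>>>>>>> on restart, whereas the ceph.conf change persists.)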
>>>>>>>>>>
>>>>>>>>>> Once I have made these changes, do you want all three log files or
>>>>>>>>>> only osd.20's?
>>>>>>>>>>
>>>>>>>>>> Thank you so much for the help
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> Pierre
>>>>>>>>>>
>>>>>>>>>> On 01/07/2014 23:51, Samuel Just wrote:
>>>>>>>>>>
>>>>>>>>>>> Can you reproduce with
>>>>>>>>>>> debug osd = 20
>>>>>>>>>>> debug filestore = 20
>>>>>>>>>>> debug ms = 1
>>>>>>>>>>> ?
>>>>>>>>>>> -Sam
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>>>>>>>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I have attached:
>>>>>>>>>>>>      - osd.20 is one of the OSDs that I found makes other OSDs
>>>>>>>>>>>> crash.
>>>>>>>>>>>>      - osd.23 is one of the OSDs which crashes when I start osd.20.
>>>>>>>>>>>>      - mds is one of my MDSs.
>>>>>>>>>>>>
>>>>>>>>>>>> I truncated the log files because they are too big. Everything is here:
>>>>>>>>>>>> https://blondeau.users.greyc.fr/cephlog/
>>>>>>>>>>>>
>>>>>>>>>>>> Regards
>>>>>>>>>>>>
>>>>>>>>>>>> On 30/06/2014 17:35, Gregory Farnum wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> What's the backtrace from the crashing OSDs?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Keep in mind that as a dev release, it's generally best not to
>>>>>>>>>>>>> upgrade
>>>>>>>>>>>>> to unnamed versions like 0.82 (but it's probably too late
>>>>>>>>>>>>> to go
>>>>>>>>>>>>> back
>>>>>>>>>>>>> now).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I will remember that next time ;)
>>>>>>>>>>>>
>>>>>>>>>>>>> -Greg
>>>>>>>>>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
>>>>>>>>>>>>> <pierre.blondeau at unicaen.fr> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After the upgrade to firefly, I have some PGs stuck in the
>>>>>>>>>>>>>> peering state. I saw that 0.82 was out, so I tried upgrading to
>>>>>>>>>>>>>> solve my problem.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My three MDSs crash, and some OSDs trigger a chain reaction that
>>>>>>>>>>>>>> kills other OSDs.
>>>>>>>>>>>>>> I think my MDSs will not start because their metadata are on the
>>>>>>>>>>>>>> OSDs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have 36 OSDs on three servers, and I identified 5 OSDs which
>>>>>>>>>>>>>> make the others crash. If I do not start them, the cluster goes
>>>>>>>>>>>>>> into a recovery state with 31 OSDs, but I have 378 PGs in the
>>>>>>>>>>>>>> down+peering state.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What can I do? Would you like more information (OS, crash logs,
>>>>>>>>>>>>>> etc.)?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ----------------------------------------------
>>>>>>>> Pierre BLONDEAU
>>>>>>>> Administrateur Systèmes & réseaux
>>>>>>>> Université de Caen
>>>>>>>> Laboratoire GREYC, Département d'informatique
>>>>>>>>
>>>>>>>> tel     : 02 31 56 75 42
>>>>>>>> bureau  : Campus 2, Science 3, 406
>>>>>>>> ----------------------------------------------
>>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> ----------------------------------------------
>>>>> Pierre BLONDEAU
>>>>> Administrateur Systèmes & réseaux
>>>>> Université de Caen
>>>>> Laboratoire GREYC, Département d'informatique
>>>>>
>>>>> tel     : 02 31 56 75 42
>>>>> bureau  : Campus 2, Science 3, 406
>>>>> ----------------------------------------------
>>
>>
>
>


-- 
----------------------------------------------
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel	: 02 31 56 75 42
bureau	: Campus 2, Science 3, 406
----------------------------------------------

