Hi,

After the repair process, I have:

            1926 active+clean
               2 active+clean+inconsistent

These two PGs seem to be on the same OSD (#34):

# ceph pg dump | grep inconsistent
dumped all in format plain
0.2e   4  0  0  0  8388660   4   4  active+clean+inconsistent  2014-07-16 11:39:43.819631  9463'4   438411:133968  [34,4]  34  [34,4]  34  9463'4   2014-07-16 04:52:54.417333  9463'4   2014-07-11 09:29:22.041717
0.1ed  5  0  0  0  8388623  10  10  active+clean+inconsistent  2014-07-16 11:39:45.820142  9712'10  438411:144792  [34,2]  34  [34,2]  34  9712'10  2014-07-16 09:12:44.742488  9712'10  2014-07-10 21:57:11.345241

Could this explain why my MDS won't start? And if I remove (or shut down) this OSD, could that solve my problem?
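For what it's worth, one possible way to inspect and try to repair the two PGs in place, rather than removing the OSD (a sketch only: the PG ids and osd id are taken from the dump above, only standard ceph CLI commands of that era are used, and whether "repair" picks the good copy depends on which replica is actually damaged):

# list the PGs currently flagged inconsistent and the scrub errors behind them
ceph health detail

# re-run a deep scrub on each PG to refresh the inconsistency report
ceph pg deep-scrub 0.2e
ceph pg deep-scrub 0.1ed

# ask the primary (osd.34) to repair each PG
ceph pg repair 0.2e
ceph pg repair 0.1ed

# if osd.34 itself is suspect, marking it out (rather than only stopping it)
# lets the PGs re-replicate from the surviving copies
# ceph osd out 34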
Regards.

Le 10/07/2014 11:51, Pierre BLONDEAU a écrit :
> Hi,
>
> Great.
>
> All my OSD restart :
>      osdmap e438044: 36 osds: 36 up, 36 in
>
> All PG page are active and some in recovery :
>      1604040/49575206 objects degraded (3.236%)
>                  1780 active+clean
>                    17 active+degraded+remapped+backfilling
>                    61 active+degraded+remapped+wait_backfill
>                    11 active+clean+scrubbing+deep
>                    34 active+remapped+backfilling
>                    21 active+remapped+wait_backfill
>                     4 active+clean+replay
>
> But all mds crash. Logs are here :
> https://blondeau.users.greyc.fr/cephlog/legacy/
>
> In any case, thank you very much for your help.
>
> Pierre
>
> Le 09/07/2014 19:34, Joao Eduardo Luis a écrit :
>> On 07/09/2014 02:22 PM, Pierre BLONDEAU wrote:
>>> Hi,
>>>
>>> There is any chance to restore my data ?
>>
>> Okay, I talked to Sam and here's what you could try before anything else:
>>
>> - Make sure you have everything running on the same version.
>> - unset the the chooseleaf_vary_r flag -- this can be accomplished by
>>   setting tunables to legacy.
>> - have the osds join in the cluster
>> - you should then either upgrade to firefly (if you haven't done so by
>>   now) or wait for the point-release before you move on to setting
>>   tunables to optimal again.
>>
>> Let us know how it goes.
>>
>> -Joao
>>
>>> Regards
>>> Pierre
>>>
>>> Le 07/07/2014 15:42, Pierre BLONDEAU a écrit :
>>>> No chance to have those logs and even less in debug mode. I do this
>>>> change 3 weeks ago.
>>>>
>>>> I put all my log here if it's can help :
>>>> https://blondeau.users.greyc.fr/cephlog/all/
>>>>
>>>> I have a chance to recover my +/- 20TB of data ?
>>>>
>>>> Regards
>>>>
>>>> Le 03/07/2014 21:48, Joao Luis a écrit :
>>>>> Do those logs have a higher debugging level than the default? If not
>>>>> nevermind as they will not have enough information. If they do however,
>>>>> we'd be interested in the portion around the moment you set the
>>>>> tunables. Say, before the upgrade and a bit after you set the tunable.
>>>>> If you want to be finer grained, then ideally it would be the moment
>>>>> where those maps were created, but you'd have to grep the logs for that.
>>>>>
>>>>> Or drop the logs somewhere and I'll take a look.
>>>>>
>>>>> -Joao
>>>>>
>>>>> On Jul 3, 2014 5:48 PM, "Pierre BLONDEAU" <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>> Le 03/07/2014 13:49, Joao Eduardo Luis a écrit :
>>>>>
>>>>> On 07/03/2014 12:15 AM, Pierre BLONDEAU wrote:
>>>>>
>>>>> Le 03/07/2014 00:55, Samuel Just a écrit :
>>>>>
>>>>> Ah,
>>>>>
>>>>> ~/logs ? for i in 20 23; do ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*; ../ceph/src/crushtool -d /tmp/crush$i > /tmp/crush$i.d; done; diff /tmp/crush20.d /tmp/crush23.d
>>>>> ../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0___4E62BB79__none'
>>>>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
>>>>> ../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0___4E62BB79__none'
>>>>> ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
>>>>> 6d5
>>>>> < tunable chooseleaf_vary_r 1
>>>>>
>>>>> Looks like the chooseleaf_vary_r tunable somehow ended up divergent?
>>>>>
>>>>> The only thing that comes to mind that could cause this is if we changed
>>>>> the leader's in-memory map, proposed it, it failed, and only the leader
>>>>> got to write the map to disk somehow. This happened once on a totally
>>>>> different issue (although I can't pinpoint right now which).
>>>>>
>>>>> In such a scenario, the leader would serve the incorrect osdmap to
>>>>> whoever asked osdmaps from it, the remaining quorum would serve the
>>>>> correct osdmaps to all the others. This could cause this divergence. Or
>>>>> it could be something else.
>>>>>
>>>>> Are there logs for the monitors for the timeframe this may have happened
>>>>> in?
>>>>>
>>>>> Which exactly timeframe you want ? I have 7 days of logs, I should
>>>>> have informations about the upgrade from firefly to 0.82.
>>>>> Which mon's log do you want ? Three ?
>>>>>
>>>>> Regards
>>>>>
>>>>> -Joao
>>>>>
>>>>> Pierre: do you recall how and when that got set?
>>>>>
>>>>> I am not sure to understand, but if I good remember after the update in
>>>>> firefly, I was in state : HEALTH_WARN crush map has legacy tunables and
>>>>> I see "feature set mismatch" in log.
>>>>>
>>>>> So if I good remeber, i do : ceph osd crush tunables optimal for the
>>>>> problem of "crush map" and I update my client and server kernel to
>>>>> 3.16rc.
>>>>>
>>>>> It's could be that ?
>>>>>
>>>>> Pierre
>>>>>
>>>>> -Sam
>>>>>
>>>>> On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just <sam.just at inktank.com> wrote:
>>>>>
>>>>> Yeah, divergent osdmaps:
>>>>> 555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0___4E62BB79__none
>>>>> 6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0___4E62BB79__none
>>>>>
>>>>> Joao: thoughts?
>>>>> -Sam
>>>>>
>>>>> On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>> The files
>>>>>
>>>>> When I upgrade :
>>>>>  ceph-deploy install --stable firefly servers...
>>>>>  on each servers service ceph restart mon
>>>>>  on each servers service ceph restart osd
>>>>>  on each servers service ceph restart mds
>>>>>
>>>>> I upgraded from emperor to firefly. After repair, remap, replace,
>>>>> etc ... I have some PG which pass in peering state.
>>>>>
>>>>> I thought why not try the version 0.82, it could solve my problem. (
>>>>> It's my mistake ). So, I upgrade from firefly to 0.83 with :
>>>>>  ceph-deploy install --testing servers...
>>>>>  ..
>>>>>
>>>>> Now, all programs are in version 0.82.
>>>>> I have 3 mons, 36 OSD and 3 mds.
>>>>> Pierre
>>>>>
>>>>> PS : I find also "inc\uosdmap.13258__0___469271DE__none" on each meta
>>>>> directory.
>>>>>
>>>>> Le 03/07/2014 00:10, Samuel Just a écrit :
>>>>>
>>>>> Also, what version did you upgrade from, and how did you upgrade?
>>>>> -Sam
>>>>>
>>>>> On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just <sam.just at inktank.com> wrote:
>>>>>
>>>>> Ok, in current/meta on osd 20 and osd 23, please attach all files
>>>>> matching
>>>>>
>>>>> ^osdmap.13258.*
>>>>>
>>>>> There should be one such file on each osd. (should look something like
>>>>> osdmap.6__0_FD6E4C01__none, probably hashed into a subdirectory,
>>>>> you'll want to use find).
>>>>>
>>>>> What version of ceph is running on your mons? How many mons do you have?
>>>>> -Sam
>>>>>
>>>>> On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I do it, the log files are available here :
>>>>> https://blondeau.users.greyc.fr/cephlog/debug20/
>>>>>
>>>>> The OSD's files are really big +/- 80M .
>>>>>
>>>>> After starting the osd.20 some other osd crash. I pass from 31 osd up
>>>>> to 16. I remark that after this the number of down+peering PG decrease
>>>>> from 367 to 248. It's "normal" ? May be it's temporary, the time that
>>>>> the cluster verifies all the PG ?
>>>>>
>>>>> Regards
>>>>> Pierre
>>>>>
>>>>> Le 02/07/2014 19:16, Samuel Just a écrit :
>>>>>
>>>>> You should add
>>>>>
>>>>> debug osd = 20
>>>>> debug filestore = 20
>>>>> debug ms = 1
>>>>>
>>>>> to the [osd] section of the ceph.conf and restart the osds. I'd like
>>>>> all three logs if possible.
>>>>>
>>>>> Thanks
>>>>> -Sam
>>>>>
>>>>> On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>> Yes, but how i do that ?
>>>>>
>>>>> With a command like that ?
>>>>>
>>>>> ceph tell osd.20 injectargs '--debug-osd 20 --debug-filestore 20 --debug-ms 1'
>>>>>
>>>>> By modify the /etc/ceph/ceph.conf ? This file is really poor because I
>>>>> use udev detection.
>>>>>
>>>>> When I have made these changes, you want the three log files or only
>>>>> osd.20's ?
>>>>>
>>>>> Thank you so much for the help
>>>>>
>>>>> Regards
>>>>> Pierre
>>>>>
>>>>> Le 01/07/2014 23:51, Samuel Just a écrit :
>>>>>
>>>>> Can you reproduce with
>>>>> debug osd = 20
>>>>> debug filestore = 20
>>>>> debug ms = 1
>>>>> ?
>>>>> -Sam
>>>>>
>>>>> On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I join :
>>>>> - osd.20 is one of osd that I detect which makes crash other OSD.
>>>>> - osd.23 is one of osd which crash when i start osd.20
>>>>> - mds, is one of my MDS
>>>>>
>>>>> I cut log file because they are to big but.
>>>>> All is here :
>>>>> https://blondeau.users.greyc.fr/cephlog/
>>>>>
>>>>> Regards
>>>>>
>>>>> Le 30/06/2014 17:35, Gregory Farnum a écrit :
>>>>>
>>>>> What's the backtrace from the crashing OSDs?
>>>>>
>>>>> Keep in mind that as a dev release, it's generally best not to upgrade
>>>>> to unnamed versions like 0.82 (but it's probably too late to go back
>>>>> now).
>>>>>
>>>>> I will remember it the next time ;)
>>>>>
>>>>> -Greg
>>>>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>>>
>>>>> On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU <pierre.blondeau at unicaen.fr> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> After the upgrade to firefly, I have some PG in peering state.
>>>>> I seen the output of 0.82 so I try to upgrade for solved my problem.
>>>>>
>>>>> My three MDS crash and some OSD triggers a chain reaction that kills
>>>>> other OSD. I think my MDS will not start because of the metadata are
>>>>> on the OSD.
>>>>>
>>>>> I have 36 OSD on three servers and I identified 5 OSD which makes crash
>>>>> others. If i not start their, the cluster passe in reconstructive state
>>>>> with 31 OSD but i have 378 in down+peering state.
>>>>>
>>>>> How can I do ? Would you more information ( os, crash log, etc ... ) ?
>>>>>
>>>>> Regards
>>>>>
>>>>> --
>>>>> ----------------------------------------------
>>>>> Pierre BLONDEAU
>>>>> Administrateur Systèmes & réseaux
>>>>> Université de Caen
>>>>> Laboratoire GREYC, Département d'informatique
>>>>>
>>>>> tel     : 02 31 56 75 42
>>>>> bureau  : Campus 2, Science 3, 406
>>>>> ----------------------------------------------
>>>>
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users at lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
----------------------------------------------
Pierre BLONDEAU
Administrateur Systèmes & réseaux
Université de Caen
Laboratoire GREYC, Département d'informatique

tel     : 02 31 56 75 42
bureau  : Campus 2, Science 3, 406
----------------------------------------------