Some OSD and MDS crash


 



On 07/09/2014 02:22 PM, Pierre BLONDEAU wrote:
> Hi,
>
> Is there any chance to restore my data?

Okay, I talked to Sam and here's what you could try before anything else 
(a rough sketch of the corresponding commands follows below):

- Make sure you have everything running on the same version.
- Unset the chooseleaf_vary_r flag -- this can be accomplished by 
setting the tunables to legacy.
- Have the OSDs rejoin the cluster.
- You should then either upgrade to firefly (if you haven't done so by 
now) or wait for the point release before you set the tunables back to 
optimal.
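Roughly, assuming the standard ceph CLI; double-check against your 
version before running anything:

   # see which tunables the cluster currently advertises
   # (chooseleaf_vary_r included), if your version has this command
   ceph osd crush show-tunables

   # drop back to legacy tunables; this clears chooseleaf_vary_r
   ceph osd crush tunables legacy

   # later, once everything is upgraded and the cluster is stable again
   ceph osd crush tunables optimal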

Let us know how it goes.

   -Joao


>
> Regards
> Pierre
>
> On 07/07/2014 at 15:42, Pierre BLONDEAU wrote:
>> I no longer have those logs, and certainly not at debug level; I made
>> that change 3 weeks ago.
>>
>> I have put all my logs here, in case it helps:
>> https://blondeau.users.greyc.fr/cephlog/all/
>>
>> Is there any chance to recover my +/- 20 TB of data?
>>
>> Regards
>>
>> On 03/07/2014 at 21:48, Joao Luis wrote:
>>> Do those logs have a higher debugging level than the default? If not,
>>> never mind, as they will not have enough information. If they do, however,
>>> we'd be interested in the portion around the moment you set the
>>> tunables. Say, before the upgrade and a bit after you set the tunables.
>>> If you want to be finer grained, then ideally it would be the moment
>>> where those maps were created, but you'd have to grep the logs for that.
>>>
>>> Or drop the logs somewhere and I'll take a look.
>>>
>>>    -Joao
>>>
>>> On Jul 3, 2014 5:48 PM, "Pierre BLONDEAU" <pierre.blondeau at unicaen.fr> wrote:
>>>
>>>     On 03/07/2014 at 13:49, Joao Eduardo Luis wrote:
>>>
>>>         On 07/03/2014 12:15 AM, Pierre BLONDEAU wrote:
>>>
>>>             On 03/07/2014 at 00:55, Samuel Just wrote:
>>>
>>>                 Ah,
>>>
>>>                 ~/logs $ for i in 20 23; do
>>>                       ../ceph/src/osdmaptool --export-crush /tmp/crush$i osd-$i*;
>>>                       ../ceph/src/crushtool -d /tmp/crush$i > /tmp/crush$i.d;
>>>                     done; diff /tmp/crush20.d /tmp/crush23.d
>>>                 ../ceph/src/osdmaptool: osdmap file 'osd-20_osdmap.13258__0___4E62BB79__none'
>>>                 ../ceph/src/osdmaptool: exported crush map to /tmp/crush20
>>>                 ../ceph/src/osdmaptool: osdmap file 'osd-23_osdmap.13258__0___4E62BB79__none'
>>>                 ../ceph/src/osdmaptool: exported crush map to /tmp/crush23
>>>                 6d5
>>>                 < tunable chooseleaf_vary_r 1
>>>
>>>                   Looks like the chooseleaf_vary_r tunable somehow ended
>>>                 up divergent?
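>>>                 (A quick way to cross-check against what the monitors are
>>>                 currently serving, as opposed to the copies on the OSDs'
>>>                 disks -- untested sketch, the paths are just examples:
>>>
>>>                     # dump and decompile the crush map the cluster is serving now
>>>                     ceph osd getcrushmap -o /tmp/crush.live
>>>                     crushtool -d /tmp/crush.live -o /tmp/crush.live.txt
>>>                     grep tunable /tmp/crush.live.txt
>>>
>>>                 If chooseleaf_vary_r shows up there but not in one OSD's
>>>                 on-disk copy, that would confirm the divergence.)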
>>>
>>>
>>>         The only thing that comes to mind that could cause this is if we
>>>         changed the leader's in-memory map, proposed it, the proposal failed,
>>>         and somehow only the leader got to write the map to disk.  This
>>>         happened once on a totally different issue (although I can't pinpoint
>>>         which one right now).
>>>
>>>         In such a scenario, the leader would serve the incorrect osdmap to
>>>         whoever asked it for osdmaps, while the remaining quorum would serve
>>>         the correct osdmaps to all the others.  This could cause this
>>>         divergence.  Or it could be something else.
>>>
>>>         Are there logs for the monitors for the timeframe this may have
>>>         happened in?
>>>
>>>
>>>     Which timeframe exactly do you want? I have 7 days of logs, so I should
>>>     have information about the upgrade from firefly to 0.82.
>>>     Which mon's logs do you want? All three?
>>>
>>>     Regards
>>>
>>>             -Joao
>>>
>>>
>>>                 Pierre: do you recall how and when that got set?
>>>
>>>
>>>             I am not sure I understand, but if I remember correctly, after
>>>             the update to firefly the cluster was in the state "HEALTH_WARN
>>>             crush map has legacy tunables" and I saw "feature set mismatch"
>>>             in the logs.
>>>
>>>             So, if I remember correctly, I ran "ceph osd crush tunables
>>>             optimal" for the "crush map" warning, and I updated my client
>>>             and server kernels to 3.16rc.
>>>
>>>             Could that be it?
>>>
>>>             Pierre
>>>
>>>                 -Sam
>>>
>>>                 On Wed, Jul 2, 2014 at 3:43 PM, Samuel Just <sam.just at inktank.com> wrote:
>>>
>>>                     Yeah, divergent osdmaps:
>>>                     555ed048e73024687fc8b106a570db4f  osd-20_osdmap.13258__0___4E62BB79__none
>>>                     6037911f31dc3c18b05499d24dcdbe5c  osd-23_osdmap.13258__0___4E62BB79__none
>>>
>>>                     Joao: thoughts?
>>>                     -Sam
>>>
>>>                     On Wed, Jul 2, 2014 at 3:39 PM, Pierre BLONDEAU
>>>                     <pierre.blondeau at unicaen.fr> wrote:
>>>
>>>                         The files are attached.
>>>
>>>                         When I upgraded:
>>>                            ceph-deploy install --stable firefly servers...
>>>                            on each server: service ceph restart mon
>>>                            on each server: service ceph restart osd
>>>                            on each server: service ceph restart mds
>>>
>>>                         I upgraded from emperor to firefly. After repair,
>>>                         remap, replace, etc. I had some PGs stuck in the
>>>                         peering state.
>>>
>>>                         I thought: why not try version 0.82, it could solve
>>>                         my problem (that was my mistake). So I upgraded from
>>>                         firefly to 0.83 with:
>>>                            ceph-deploy install --testing servers...
>>>                            ..
>>>
>>>                         Now, all programs are at version 0.82.
>>>                         I have 3 mons, 36 OSDs and 3 MDSs.
>>>
>>>                         Pierre
>>>
>>>                         PS : I also find "inc\uosdmap.13258__0___469271DE__none"
>>>                         in each meta directory.
>>>
>>>                         On 03/07/2014 at 00:10, Samuel Just wrote:
>>>
>>>                             Also, what version did you upgrade from, and
>>>                             how did you upgrade?
>>>                             -Sam
>>>
>>>                             On Wed, Jul 2, 2014 at 3:09 PM, Samuel Just
>>>                             <sam.just at inktank.com> wrote:
>>>
>>>
>>>                                 Ok, in current/meta on osd 20 and osd 23, please
>>>                                 attach all files matching
>>>
>>>                                 ^osdmap.13258.*
>>>
>>>                                 There should be one such file on each osd (it
>>>                                 should look something like
>>>                                 osdmap.6__0_FD6E4C01__none, probably hashed into
>>>                                 a subdirectory; you'll want to use find).
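>>>                                 (Something like this should locate them -- the
>>>                                 paths below assume the default OSD data
>>>                                 directories, adjust as needed:
>>>
>>>                                     find /var/lib/ceph/osd/ceph-20/current/meta \
>>>                                         -name 'osdmap.13258*'
>>>                                     find /var/lib/ceph/osd/ceph-23/current/meta \
>>>                                         -name 'osdmap.13258*'
>>>                                 )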
>>>
>>>                                 What version of ceph is running on your
>>>                                 mons?  How many mons do
>>>                                 you have?
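>>>                                 (To check, something like the following on each
>>>                                 mon host -- the admin socket path assumes the
>>>                                 default location and that the mon id matches the
>>>                                 short hostname:
>>>
>>>                                     # version of the installed package
>>>                                     ceph --version
>>>                                     # version the running mon daemon reports
>>>                                     ceph --admin-daemon \
>>>                                         /var/run/ceph/ceph-mon.$(hostname -s).asok version
>>>                                 )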
>>>                                 -Sam
>>>
>>>                                 On Wed, Jul 2, 2014 at 2:21 PM, Pierre BLONDEAU
>>>                                 <pierre.blondeau at unicaen.fr> wrote:
>>>
>>>
>>>                                     Hi,
>>>
>>>                                     I did it; the log files are available here:
>>>                                     https://blondeau.users.greyc.fr/cephlog/debug20/
>>>
>>>                                     The OSD log files are really big, +/- 80 MB.
>>>
>>>                                     After starting osd.20, some other OSDs crashed;
>>>                                     the number of OSDs up dropped from 31 to 16.
>>>                                     I noticed that after this the number of
>>>                                     down+peering PGs decreased from 367 to 248.
>>>                                     Is that "normal"? Maybe it's temporary, while
>>>                                     the cluster verifies all the PGs?
>>>
>>>                                     Regards
>>>                                     Pierre
>>>
>>>                                     On 02/07/2014 at 19:16, Samuel Just wrote:
>>>
>>>                                         You should add
>>>
>>>                                         debug osd = 20
>>>                                         debug filestore = 20
>>>                                         debug ms = 1
>>>
>>>                                         to the [osd] section of ceph.conf and
>>>                                         restart the osds.  I'd like all three
>>>                                         logs if possible.
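>>>                                         (If the ceph.conf is nearly empty, the
>>>                                         added section would just look like this;
>>>                                         adjust to whatever is already in the file:
>>>
>>>                                             [osd]
>>>                                                 debug osd = 20
>>>                                                 debug filestore = 20
>>>                                                 debug ms = 1
>>>                                         )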
>>>
>>>                                         Thanks
>>>                                         -Sam
>>>
>>>                                         On Wed, Jul 2, 2014 at 5:03 AM, Pierre BLONDEAU
>>>                                         <pierre.blondeau at unicaen.fr> wrote:
>>>
>>>
>>>
>>>                                             Yes, but how do I do that?
>>>
>>>                                             With a command like this?
>>>
>>>                                             ceph tell osd.20 injectargs '--debug-osd 20
>>>                                             --debug-filestore 20 --debug-ms 1'
>>>
>>>                                             Or by modifying /etc/ceph/ceph.conf? That
>>>                                             file is really minimal because I use udev
>>>                                             detection.
>>>
>>>                                             Once I have made these changes, do you want
>>>                                             the three log files or only osd.20's?
>>>
>>>                                             Thank you so much for the help.
>>>
>>>                                             Regards
>>>                                             Pierre
>>>
>>>                                             On 01/07/2014 at 23:51, Samuel Just wrote:
>>>
>>>                                                 Can you reproduce with
>>>                                                 debug osd = 20
>>>                                                 debug filestore = 20
>>>                                                 debug ms = 1
>>>                                                 ?
>>>                                                 -Sam
>>>
>>>                                                 On Tue, Jul 1, 2014 at 1:21 AM, Pierre BLONDEAU
>>>                                                 <pierre.blondeau at unicaen.fr> wrote:
>>>
>>>
>>>
>>>
>>>                                                     Hi,
>>>
>>>                                                     I attach:
>>>                                                       - osd.20, one of the OSDs that I
>>>                                                         identified as making other OSDs crash.
>>>                                                       - osd.23, one of the OSDs that crashes
>>>                                                         when I start osd.20.
>>>                                                       - mds, one of my MDSs.
>>>
>>>                                                     I truncated the log files because they
>>>                                                     are too big, but everything is here:
>>>                                                     https://blondeau.users.greyc.fr/cephlog/
>>>
>>>                                                     Regards
>>>
>>>                                                     On 30/06/2014 at 17:35, Gregory Farnum wrote:
>>>
>>>                                                         What's the backtrace from the
>>>                                                         crashing OSDs?
>>>
>>>                                                         Keep in mind that as a dev release,
>>>                                                         it's generally best not to upgrade
>>>                                                         to unnamed versions like 0.82 (but
>>>                                                         it's probably too late to go back
>>>                                                         now).
>>>
>>>
>>>
>>>
>>>                                                     I will remember that next time ;)
>>>
>>>                                                         -Greg
>>>                                                         Software Engineer #42 @ http://inktank.com | http://ceph.com
>>>
>>>                                                         On Mon, Jun 30, 2014 at 8:06 AM, Pierre BLONDEAU
>>>                                                         <pierre.blondeau at unicaen.fr> wrote:
>>>
>>>
>>>
>>>                                                             Hi,
>>>
>>>                                                             After the upgrade to firefly, I
>>>                                                             have some PGs stuck in the
>>>                                                             peering state.
>>>                                                             I saw that 0.82 had been
>>>                                                             released, so I tried to upgrade
>>>                                                             to solve my problem.
>>>
>>>                                                             My three MDSs crash, and some
>>>                                                             OSDs trigger a chain reaction
>>>                                                             that kills other OSDs.
>>>                                                             I think my MDSs will not start
>>>                                                             because their metadata are on
>>>                                                             the OSDs.
>>>
>>>                                                             I have 36 OSDs on three servers,
>>>                                                             and I identified 5 OSDs which
>>>                                                             make the others crash. If I do
>>>                                                             not start those, the cluster
>>>                                                             goes into recovery with 31 OSDs,
>>>                                                             but I have 378 PGs in the
>>>                                                             down+peering state.
>>>
>>>                                                             What can I do? Would you like
>>>                                                             more information (OS, crash
>>>                                                             logs, etc.)?
>>>
>>>                                                             Regards
>>>
>>>
>>>
>>>
>>>                                     --
>>>                                     ------------------------------------------------
>>>                                     Pierre BLONDEAU
>>>                                     Systems & network administrator
>>>                                     Université de Caen
>>>                                     Laboratoire GREYC, Département d'informatique
>>>
>>>                                     tel     : 02 31 56 75 42
>>>                                     office  : Campus 2, Science 3, 406
>>>                                     ------------------------------------------------
>>>
>>>
>>>
>>>                         --
>>>                         ------------------------------------------------
>>>                         Pierre BLONDEAU
>>>                         Systems & network administrator
>>>                         Université de Caen
>>>                         Laboratoire GREYC, Département d'informatique
>>>
>>>                         tel     : 02 31 56 75 42
>>>                         office  : Campus 2, Science 3, 406
>>>                         ------------------------------------------------
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>     --
>>>     ------------------------------------------------
>>>     Pierre BLONDEAU
>>>     Systems & network administrator
>>>     Université de Caen
>>>     Laboratoire GREYC, Département d'informatique
>>>
>>>     tel     : 02 31 56 75 42
>>>     office  : Campus 2, Science 3, 406
>>>     ------------------------------------------------
>>>
>>>
>>
>>
>>
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users at lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>


-- 
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com

