Re: Cluster stuck in failed state after power failure - please help

Jan Pekař - Imatic <jan.pekar@xxxxxxxxx> · Mon, 11 Dec 2017 20:07:12 +0100

Hi,

thank you for the response. I started the MDS manually and could access
CephFS. I'm not running an mgr yet; it is not necessary.
I just replied on the mailing list. It looks like the dump from ceph is
incorrect and the cluster is "working somehow". So the problem is
something other than my mgr or mds not running.

With regards
Jan Pekar

On 11.12.2017 19:42, David Turner wrote:
It honestly just looks like your MDS and MGR daemons are not configured
to start automatically. Try starting them manually, and if that fixes
things, go through and enable them to start automatically. Assuming you
use systemctl, the commands to check and fix this would be something
like the following. The first one shows everything that is started with
ceph.target.

sudo systemctl list-dependencies ceph.target
sudo systemctl enable ceph-mgr@servername
sudo systemctl enable ceph-mds@servername

On Mon, Dec 11, 2017 at 1:08 PM Jan Pekař - Imatic <jan.pekar@xxxxxxxxx> wrote:

    Hi all,

    I hope somebody can help me. I have a home Ceph installation.
    After a power failure (which can happen in a datacenter too) my
    cluster booted into an inconsistent state.

    I was backfilling data onto one new disk when the power failed. The
    first time, it booted without some OSDs, but I fixed that. Now all my
    OSDs are running, but after some time the cluster state looks like this:

        cluster:
          id:     2d9bf17f-3d50-4a59-8359-abc8328fe801
          health: HEALTH_WARN
                  1 filesystem is degraded
                  1 filesystem has a failed mds daemon
                  noout,nodeep-scrub flag(s) set
                  no active mgr
                  317162/12520262 objects misplaced (2.533%)
                  Reduced data availability: 52 pgs inactive, 29 pgs down, 1 pg peering, 1 pg stale
                  Degraded data redundancy: 2099528/12520262 objects degraded (16.769%), 427 pgs unclean, 368 pgs degraded, 368 pgs undersized
                  1/3 mons down, quorum imatic-mce-2,imatic-mce

        services:
          mon: 3 daemons, quorum imatic-mce-2,imatic-mce, out of quorum: obyvak
          mgr: no daemons active
          mds: cephfs-0/1/1 up , 1 failed
          osd: 8 osds: 8 up, 8 in; 61 remapped pgs
               flags noout,nodeep-scrub

        data:
          pools:   8 pools, 896 pgs
          objects: 4446k objects, 9119 GB
          usage:   9698 GB used, 2290 GB / 11988 GB avail
          pgs:     2.455% pgs unknown
                   3.348% pgs not active
                   2099528/12520262 objects degraded (16.769%)
                   317162/12520262 objects misplaced (2.533%)
                   371 stale+active+clean
                   183 active+undersized+degraded
                   154 stale+active+undersized+degraded
                   85  active+clean
                   22  unknown
                   19  stale+down
                   14  stale+active+undersized+degraded+remapped+backfill_wait
                   13  active+undersized+degraded+remapped+backfill_wait
                   10  down
                   6   active+clean+remapped
                   6   stale+active+clean+remapped
                   5   stale+active+remapped+backfill_wait
                   2   active+remapped+backfill_wait
                   2   stale+active+undersized+degraded+remapped+backfilling
                   1   active+undersized+degraded+remapped
                   1   active+undersized+degraded+remapped+backfilling
                   1   stale+peering
                   1   stale+active+clean+scrubbing
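
As a rough cross-check of the status output above, the per-state PG counts can be tallied by individual flag. This is only a sketch: the function name `tally_pg_flags` and the variable `sample` are made up for illustration, and the data is copied from the `pgs:` list above.

```python
from collections import Counter

def tally_pg_flags(lines):
    """Sum PG counts per individual state flag (e.g. 'stale', 'down')."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        # Expect "<count> <state+state+...>"; skip anything else.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        count = int(parts[0])
        for flag in parts[1].split("+"):
            totals[flag] += count
    return totals

# PG state lines copied from the status output above.
sample = """
371 stale+active+clean
183 active+undersized+degraded
154 stale+active+undersized+degraded
85  active+clean
22  unknown
19  stale+down
14  stale+active+undersized+degraded+remapped+backfill_wait
13  active+undersized+degraded+remapped+backfill_wait
10  down
6   active+clean+remapped
6   stale+active+clean+remapped
5   stale+active+remapped+backfill_wait
2   active+remapped+backfill_wait
2   stale+active+undersized+degraded+remapped+backfilling
1   active+undersized+degraded+remapped
1   active+undersized+degraded+remapped+backfilling
1   stale+peering
1   stale+active+clean+scrubbing
""".splitlines()

totals = tally_pg_flags(sample)
for flag in ("stale", "down", "degraded", "unknown"):
    print(flag, totals[flag])
```

On this data the tally gives 573 PGs carrying the stale flag, 29 down and 368 degraded; the last two agree with the health summary.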

    All OSDs are up and running. Before the power failure I ran
    ceph osd out
    on one of my disks and removed that disk from the cluster because I
    no longer want to use it. That triggered a CRUSH reweight and started
    rebuilding my data. I think that should not have put my data in
    danger, even though I saw that some of my PGs were undersized (why?),
    but that is not the issue right now.

    When I run
    ceph pg dump
    I get no response.

    But ceph osd dump shows a strange OSD number in some temporary PG
    mappings (pg_temp), namely 2147483647. I think there is a problem in
    one of the mons or in some other database, and the peering process
    cannot complete.

    What can I do next? I trusted the cluster so much that I have some
    data on it that I want back. Thank you very much for your help.

    My ceph osd dump looks like this:

    epoch 29442
    fsid 2d9bf17f-3d50-4a59-8359-abc8328fe801
    created 2014-12-10 23:00:49.140787
    modified 2017-12-11 18:54:01.134091
    flags noout,nodeep-scrub,sortbitwise,recovery_deletes
    crush_version 14
    full_ratio 0.97
    backfillfull_ratio 0.91
    nearfull_ratio 0.9
    require_min_compat_client firefly
    min_compat_client firefly
    require_osd_release luminous
    pool 0 'data' replicated size 2 min_size 1 crush_rule 0 object_hash
    rjenkins pg_num 64 pgp_num 64 last_change 27537 flags hashpspool
    crash_replay_interval 45 min_read_recency_for_promote 1
    min_write_recency_for_promote 1 stripe_width 0 application cephfs
    pool 1 'metadata' replicated size 3 min_size 1 crush_rule 1 object_hash
    rjenkins pg_num 64 pgp_num 64 last_change 27537 flags hashpspool
    min_read_recency_for_promote 1 min_write_recency_for_promote 1
    stripe_width 0 application cephfs
    pool 2 'rbd' replicated size 2 min_size 1 crush_rule 0 object_hash
    rjenkins pg_num 64 pgp_num 64 last_change 28088 flags hashpspool
    min_read_recency_for_promote 1 min_write_recency_for_promote 1
    stripe_width 0 application rbd
              removed_snaps [1~5]
    pool 3 'nonreplicated' replicated size 1 min_size 1 crush_rule 2
    object_hash rjenkins pg_num 192 pgp_num 192 last_change 27537 flags
    hashpspool min_read_recency_for_promote 1 min_write_recency_for_promote
    1 stripe_width 0 application cephfs
    pool 4 'replicated' replicated size 2 min_size 1 crush_rule 0
    object_hash rjenkins pg_num 192 pgp_num 192 last_change 27537 lfor
    17097/17097 flags hashpspool min_read_recency_for_promote 1
    min_write_recency_for_promote 1 stripe_width 0 application cephfs
    pool 10 'erasure_3_1' erasure size 4 min_size 3 crush_rule 3 object_hash
    rjenkins pg_num 128 pgp_num 128 last_change 27537 lfor 9127/9127 flags
    hashpspool tiers 11 read_tier 11 write_tier 11
    min_write_recency_for_promote 1 stripe_width 4128 application cephfs
    pool 11 'erasure_3_1_hot' replicated size 2 min_size 1 crush_rule 1
    object_hash rjenkins pg_num 128 pgp_num 128 last_change 9910 flags
    hashpspool,incomplete_clones tier_of 10 cache_mode writeback
    target_bytes 5368709120 hit_set bloom{false_positive_probability: 0.05,
    target_size: 0, seed: 0} 0s x0 decay_rate 0 search_last_n 1
    min_write_recency_for_promote 1 stripe_width 0
    pool 12 'test' replicated size 1 min_size 1 crush_rule 4 object_hash
    rjenkins pg_num 64 pgp_num 64 last_change 27463 flags hashpspool
    stripe_width 0
    max_osd 8
    osd.0 up   in  weight 1 up_from 29416 up_thru 29433 down_at 29407
    last_clean_interval [29389,29406) 192.168.11.165:6800/9273
    192.168.11.165:6801/9273 192.168.11.165:6802/9273
    192.168.11.165:6803/9273 exists,up
    630fe0dc-9ec0-456a-bf15-51d6d3ba462d
    osd.1 up   in  weight 1 up_from 29422 up_thru 29437 down_at 29407
    last_clean_interval [29390,29406) 192.168.11.165:6816/9336
    192.168.11.165:6817/9336 192.168.11.165:6818/9336
    192.168.11.165:6819/9336 exists,up
    ef583c8d-171f-47c4-8a9e-e9eb913cb272
    osd.2 up   in  weight 1 up_from 29409 up_thru 29433 down_at 29407
    last_clean_interval [29389,29406) 192.168.11.165:6804/9285
    192.168.11.165:6805/9285 192.168.11.165:6806/9285
    192.168.11.165:6807/9285 exists,up
    1de26ef5-319d-426e-ad75-65aedbbd0328
    osd.3 up   in  weight 1 up_from 29430 up_thru 29439 down_at 29410
    last_clean_interval [29391,29406) 192.168.11.165:6824/12146
    192.168.11.165:6825/12146 192.168.11.165:6826/12146
    192.168.11.165:6827/12146 exists,up
    5b63a084-cb0c-4e6a-89c1-9d2fc70cea02
    osd.4 up   in  weight 1 up_from 29442 up_thru 0 down_at 29347
    last_clean_interval [29317,29343) 192.168.11.165:6828/15193
    192.168.11.165:6829/15193 192.168.11.165:6830/15193
    192.168.11.165:6831/15193 exists,up
    ee9d758d-f2df-41b6-9320-ce89f54c116b
    osd.5 up   in  weight 1 up_from 29414 up_thru 29431 down_at 29413
    last_clean_interval [29390,29406) 192.168.11.165:6812/9321
    192.168.11.165:6813/9321 192.168.11.165:6814/9321
    192.168.11.165:6815/9321 exists,up
    d1077d42-2c92-4afd-a11e-02fdd59b393b
    osd.6 up   in  weight 1 up_from 29413 up_thru 29433 down_at 29407
    last_clean_interval [29390,29406) 192.168.11.165:6820/9345
    192.168.11.165:6821/9345 192.168.11.165:6822/9345
    192.168.11.165:6823/9345 exists,up
    f55da9e5-0c03-43fa-af59-56add845c706
    osd.7 up   in  weight 1 up_from 29422 up_thru 29433 down_at 29407
    last_clean_interval [29389,29406) 192.168.11.165:6808/9309
    192.168.11.165:6809/9309 192.168.11.165:6810/9309
    192.168.11.165:6811/9309 exists,up
    1e75647b-a1fc-4672-957f-ce5c2b0f4a43
    pg_temp 0.0 [0,5]
    pg_temp 0.2c [0,6]
    pg_temp 0.31 [7,5]
    pg_temp 0.33 [0,2]
    pg_temp 1.3 [1,6,5]
    pg_temp 1.1d [1,3,2]
    pg_temp 1.27 [6,5,2]
    pg_temp 1.2c [3,1,5]
    pg_temp 1.36 [6,3,2]
    pg_temp 1.37 [6,1,5]
    pg_temp 1.38 [6,3,1]
    pg_temp 1.3f [3,5,1]
    pg_temp 4.0 [3,6]
    pg_temp 4.4 [0,6]
    pg_temp 4.9 [0,7]
    pg_temp 4.13 [6,3]
    pg_temp 4.1a [6,5]
    pg_temp 4.41 [6,2]
    pg_temp 4.4b [0,1]
    pg_temp 4.5c [6,0]
    pg_temp 4.76 [0,6]
    pg_temp 4.87 [0,7]
    pg_temp 4.9d [0,3]
    pg_temp 4.a9 [6,7]
    pg_temp 10.1 [0,7,2,1]
    pg_temp 10.2 [3,0,1,7]
    pg_temp 10.5 [2,6,3,0]
    pg_temp 10.7 [6,7,0,3]
    pg_temp 10.a [0,3,5,6]
    pg_temp 10.c [7,6,5,0]
    pg_temp 10.f [7,3,6,0]
    pg_temp 10.1c [6,1,0,3]
    pg_temp 10.25 [0,7,5,3]
    pg_temp 10.26 [3,2,5,1]
    pg_temp 10.29 [7,5,2,0]
    pg_temp 10.2f [1,5,7,0]
    pg_temp 10.3b [7,3,0,6]
    pg_temp 10.41 [0,1,3,7]
    pg_temp 10.47 [3,0,7,5]
    pg_temp 10.4c [7,3,5,1]
    pg_temp 10.51 [1,7,0,2]
    pg_temp 10.54 [3,0,7,5]
    pg_temp 10.55 [7,1,0,3]
    pg_temp 10.5a [7,0,3,2]
    pg_temp 10.5b [2147483647,0,5,6]
    pg_temp 10.5e [7,2,3,0]
    pg_temp 10.5f [6,3,2,1]
    pg_temp 10.63 [7,5,3,2]
    pg_temp 10.64 [3,6,7,1]
    pg_temp 10.66 [0,1,5,7]
    pg_temp 10.6c [7,3,2,0]
    pg_temp 10.6f [6,3,2,7]
    pg_temp 10.70 [0,7,6,1]
    pg_temp 10.72 [7,6,1,3]
    pg_temp 10.73 [6,2147483647,7,0]
    pg_temp 10.74 [7,1,6,3]
    pg_temp 10.7c [0,6,1,7]
    pg_temp 10.7f [0,7,1,2]
    pg_temp 11.27 [6,5]
    pg_temp 11.38 [3,6]
    pg_temp 11.78 [3,1]
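
To pick out the odd entries in a list like the one above, the pg_temp lines can be scanned for the value 2147483647. That number is 2^31 - 1; as far as I know Ceph uses it as a "no OSD in this slot" placeholder rather than a real OSD id, but treat that reading as an assumption. Both affected PGs here are in pool 10, the erasure-coded pool. The helper name `pg_temp_with_missing_osd` and the constant `NO_OSD` are made up for this sketch, and the sample lines are taken from the dump.

```python
import re

# 2**31 - 1; assumed here to mean "no OSD mapped to this slot".
NO_OSD = 2147483647

def pg_temp_with_missing_osd(dump_text):
    """Return {pgid: [slot indexes]} for pg_temp entries containing NO_OSD."""
    result = {}
    for line in dump_text.splitlines():
        m = re.match(r"\s*pg_temp\s+(\S+)\s+\[([\d,\s]+)\]", line)
        if not m:
            continue  # not a (clean) pg_temp line
        osds = [int(x) for x in m.group(2).split(",")]
        missing = [i for i, osd in enumerate(osds) if osd == NO_OSD]
        if missing:
            result[m.group(1)] = missing
    return result

# A few pg_temp lines from the osd dump.
sample = """
pg_temp 10.5a [7,0,3,2]
pg_temp 10.5b [2147483647,0,5,6]
pg_temp 10.73 [6,2147483647,7,0]
pg_temp 11.27 [6,5]
"""

print(pg_temp_with_missing_osd(sample))
```

The function reads plain text, so the output of ceph osd dump could be saved to a file and passed in the same way.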

    Thank you all for your help. It is important to me.
    Jan Pekar

    _______________________________________________
    ceph-users mailing list
    ceph-users@xxxxxxxxxxxxxx
    http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
============
Ing. Jan Pekař
jan.pekar@xxxxxxxxx | +420603811737
----
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz
============
--