Re: Cluster not recovering after OSD daemon is down

PGs are degraded because they don't have enough copies of the data. What
is your replication size?

You can refer to
http://docs.ceph.com/docs/master/rados/operations/pg-states/  for PG states.
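
If it helps, the per-pool replication settings can be checked with
something like this (<pool-name> is a placeholder):

    ceph osd pool get <pool-name> size        # replica count
    ceph osd pool get <pool-name> min_size    # replicas required to serve I/O
    ceph osd dump | grep 'replicated size'    # same information for all pools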

Varada

On Tuesday 03 May 2016 06:56 PM, Gaurav Bafna wrote:
> Also, the old PGs are not mapped to the down OSD, as seen in the
> ceph health detail output:
>
> pg 5.72 is active+undersized+degraded, acting [16,49]
> pg 5.4e is active+undersized+degraded, acting [16,38]
> pg 5.32 is active+undersized+degraded, acting [39,19]
> pg 5.37 is active+undersized+degraded, acting [43,1]
> pg 5.2c is active+undersized+degraded, acting [47,18]
> pg 5.27 is active+undersized+degraded, acting [26,19]
> pg 6.13 is active+undersized+degraded, acting [30,16]
> pg 4.17 is active+undersized+degraded, acting [47,20]
> pg 7.a is active+undersized+degraded, acting [38,2]
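>
> (The id of the down OSD can be cross-checked against the acting sets
> above with something like "ceph osd tree | grep -w down".)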
>
> From pg query of 7.a
>
> {
>     "state": "active+undersized+degraded",
>     "snap_trimq": "[]",
>     "epoch": 857,
>     "up": [
>         38,
>         2
>     ],
>     "acting": [
>         38,
>         2
>     ],
>     "actingbackfill": [
>         "2",
>         "38"
>     ],
>     "info": {
>         "pgid": "7.a",
>         "last_update": "0'0",
>         "last_complete": "0'0",
>         "log_tail": "0'0",
>         "last_user_version": 0,
>         "last_backfill": "MAX",
>         "purged_snaps": "[]",
>         "history": {
>             "epoch_created": 13,
>             "last_epoch_started": 818,
>             "last_epoch_clean": 818,
>             "last_epoch_split": 0,
>             "same_up_since": 817,
>             "same_interval_since": 817,
>
>
> Complete pg query info at: http://pastebin.com/ZHB6M4PQ
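>
> (For reference, the output above came from "ceph pg 7.a query"; a quicker
> check of just the mapping should also be possible with "ceph pg map 7.a",
> which prints the osdmap epoch plus the up and acting sets.)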
>
> On Tue, May 3, 2016 at 6:46 PM, Gaurav Bafna <bafnag@xxxxxxxxx> wrote:
>> Thanks Tupper for replying.
>>
>> Shouldn't the PGs be remapped to other OSDs?
>>
>> Yes, removing the OSD from the cluster does result in a full recovery,
>> but that should not be needed, right?
>>
>>
>>
>> On Tue, May 3, 2016 at 6:31 PM, Tupper Cole <tcole@xxxxxxxxxx> wrote:
>>> The degraded PGs are mapped to the down OSD and have not been remapped to a
>>> new OSD. Removing the OSD would likely result in a full recovery.
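>>>
>>> For reference, a down OSD is usually taken out and removed with something
>>> along these lines (<id> being the id of the down OSD):
>>>
>>>     ceph osd out <id>
>>>     ceph osd crush remove osd.<id>
>>>     ceph auth del osd.<id>
>>>     ceph osd rm <id>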
>>>
>>> As a note, having two monitors (or any even number of monitors) is not
>>> recommended. If either monitor goes down you will lose quorum. The
>>> recommended number of monitors for any cluster is at least three.
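>>>
>>> Monitor quorum can be checked with, for example:
>>>
>>>     ceph mon stat
>>>     ceph quorum_status --format json-pretty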
>>>
>>> On Tue, May 3, 2016 at 8:42 AM, Gaurav Bafna <bafnag@xxxxxxxxx> wrote:
>>>> Hi Cephers,
>>>>
>>>> I am running a very small cluster of 3 storage and 2 monitor nodes.
>>>>
>>>> After I kill one OSD daemon, the cluster never recovers fully: 9 PGs
>>>> remain undersized for an unknown reason.
>>>>
>>>> After I restart that one OSD daemon, the cluster recovers in no time.
>>>>
>>>> The size of all pools is 3 and min_size is 2.
>>>>
>>>> Can anybody please help?
>>>>
>>>> Output of "ceph -s":
>>>>     cluster fac04d85-db48-4564-b821-deebda046261
>>>>      health HEALTH_WARN
>>>>             9 pgs degraded
>>>>             9 pgs stuck degraded
>>>>             9 pgs stuck unclean
>>>>             9 pgs stuck undersized
>>>>             9 pgs undersized
>>>>             recovery 3327/195138 objects degraded (1.705%)
>>>>             pool .users pg_num 512 > pgp_num 8
>>>>      monmap e2: 2 mons at
>>>> {dssmon2=10.140.13.13:6789/0,dssmonleader1=10.140.13.11:6789/0}
>>>>             election epoch 1038, quorum 0,1 dssmonleader1,dssmon2
>>>>      osdmap e857: 69 osds: 68 up, 68 in
>>>>       pgmap v106601: 896 pgs, 9 pools, 435 MB data, 65047 objects
>>>>             279 GB used, 247 TB / 247 TB avail
>>>>             3327/195138 objects degraded (1.705%)
>>>>                  887 active+clean
>>>>                    9 active+undersized+degraded
>>>>   client io 395 B/s rd, 0 B/s wr, 0 op/s
>>>>
>>>> "ceph health detail" output:
>>>>
>>>> HEALTH_WARN 9 pgs degraded; 9 pgs stuck degraded; 9 pgs stuck unclean;
>>>> 9 pgs stuck undersized; 9 pgs undersized; recovery 3327/195138 objects
>>>> degraded (1.705%); pool .users pg_num 512 > pgp_num 8
>>>> pg 7.a is stuck unclean for 322742.938959, current state
>>>> active+undersized+degraded, last acting [38,2]
>>>> pg 5.27 is stuck unclean for 322754.823455, current state
>>>> active+undersized+degraded, last acting [26,19]
>>>> pg 5.32 is stuck unclean for 322750.685684, current state
>>>> active+undersized+degraded, last acting [39,19]
>>>> pg 6.13 is stuck unclean for 322732.665345, current state
>>>> active+undersized+degraded, last acting [30,16]
>>>> pg 5.4e is stuck unclean for 331869.103538, current state
>>>> active+undersized+degraded, last acting [16,38]
>>>> pg 5.72 is stuck unclean for 331871.208948, current state
>>>> active+undersized+degraded, last acting [16,49]
>>>> pg 4.17 is stuck unclean for 331822.771240, current state
>>>> active+undersized+degraded, last acting [47,20]
>>>> pg 5.2c is stuck unclean for 323021.274535, current state
>>>> active+undersized+degraded, last acting [47,18]
>>>> pg 5.37 is stuck unclean for 323007.574395, current state
>>>> active+undersized+degraded, last acting [43,1]
>>>> pg 7.a is stuck undersized for 322487.284302, current state
>>>> active+undersized+degraded, last acting [38,2]
>>>> pg 5.27 is stuck undersized for 322487.287164, current state
>>>> active+undersized+degraded, last acting [26,19]
>>>> pg 5.32 is stuck undersized for 322487.285566, current state
>>>> active+undersized+degraded, last acting [39,19]
>>>> pg 6.13 is stuck undersized for 322487.287168, current state
>>>> active+undersized+degraded, last acting [30,16]
>>>> pg 5.4e is stuck undersized for 331351.476170, current state
>>>> active+undersized+degraded, last acting [16,38]
>>>> pg 5.72 is stuck undersized for 331351.475707, current state
>>>> active+undersized+degraded, last acting [16,49]
>>>> pg 4.17 is stuck undersized for 322487.280309, current state
>>>> active+undersized+degraded, last acting [47,20]
>>>> pg 5.2c is stuck undersized for 322487.286347, current state
>>>> active+undersized+degraded, last acting [47,18]
>>>> pg 5.37 is stuck undersized for 322487.280027, current state
>>>> active+undersized+degraded, last acting [43,1]
>>>> pg 7.a is stuck degraded for 322487.284340, current state
>>>> active+undersized+degraded, last acting [38,2]
>>>> pg 5.27 is stuck degraded for 322487.287202, current state
>>>> active+undersized+degraded, last acting [26,19]
>>>> pg 5.32 is stuck degraded for 322487.285604, current state
>>>> active+undersized+degraded, last acting [39,19]
>>>> pg 6.13 is stuck degraded for 322487.287207, current state
>>>> active+undersized+degraded, last acting [30,16]
>>>> pg 5.4e is stuck degraded for 331351.476209, current state
>>>> active+undersized+degraded, last acting [16,38]
>>>> pg 5.72 is stuck degraded for 331351.475746, current state
>>>> active+undersized+degraded, last acting [16,49]
>>>> pg 4.17 is stuck degraded for 322487.280348, current state
>>>> active+undersized+degraded, last acting [47,20]
>>>> pg 5.2c is stuck degraded for 322487.286386, current state
>>>> active+undersized+degraded, last acting [47,18]
>>>> pg 5.37 is stuck degraded for 322487.280066, current state
>>>> active+undersized+degraded, last acting [43,1]
>>>> pg 5.72 is active+undersized+degraded, acting [16,49]
>>>> pg 5.4e is active+undersized+degraded, acting [16,38]
>>>> pg 5.32 is active+undersized+degraded, acting [39,19]
>>>> pg 5.37 is active+undersized+degraded, acting [43,1]
>>>> pg 5.2c is active+undersized+degraded, acting [47,18]
>>>> pg 5.27 is active+undersized+degraded, acting [26,19]
>>>> pg 6.13 is active+undersized+degraded, acting [30,16]
>>>> pg 4.17 is active+undersized+degraded, acting [47,20]
>>>> pg 7.a is active+undersized+degraded, acting [38,2]
>>>> recovery 3327/195138 objects degraded (1.705%)
>>>> pool .users pg_num 512 > pgp_num 8
>>>>
>>>>
>>>> My crush map is default.
>>>>
>>>> Ceph.conf is :
>>>>
>>>> [osd]
>>>> osd mkfs type=xfs
>>>> osd recovery threads=2
>>>> osd disk thread ioprio class=idle
>>>> osd disk thread ioprio priority=7
>>>> osd journal=/var/lib/ceph/osd/ceph-$id/journal
>>>> filestore flusher=False
>>>> osd op num shards=3
>>>> debug osd=5
>>>> osd disk threads=2
>>>> osd data=/var/lib/ceph/osd/ceph-$id
>>>> osd op num threads per shard=5
>>>> osd op threads=4
>>>> keyring=/var/lib/ceph/osd/ceph-$id/keyring
>>>> osd journal size=4096
>>>>
>>>>
>>>> [global]
>>>> filestore max sync interval=10
>>>> auth cluster required=cephx
>>>> osd pool default min size=3
>>>> osd pool default size=3
>>>> public network=10.140.13.0/26
>>>> objecter inflight op_bytes=1073741824
>>>> auth service required=cephx
>>>> filestore min sync interval=1
>>>> fsid=fac04d85-db48-4564-b821-deebda046261
>>>> keyring=/etc/ceph/keyring
>>>> cluster network=10.140.13.0/26
>>>> auth client required=cephx
>>>> filestore xattr use omap=True
>>>> max open files=65536
>>>> objecter inflight ops=2048
>>>> osd pool default pg num=512
>>>> log to syslog = true
>>>> #err to syslog = true
>>>>
>>>>
>>>> --
>>>> Gaurav Bafna
>>>> 9540631400
>>>> _______________________________________________
>>>> ceph-users mailing list
>>>> ceph-users@xxxxxxxxxxxxxx
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>>
>>>
>>> --
>>>
>>> Thanks,
>>> Tupper Cole
>>> Senior Storage Consultant
>>> Global Storage Consulting, Red Hat
>>> tcole@xxxxxxxxxx
>>> phone:  + 01 919-720-2612
>>
>>
>> --
>> Gaurav Bafna
>> 9540631400
>
>

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


