This cluster has been unhealthy for a long time, so this issue is not
happening out of the blue.

root@ld3955:~# ceph -s
  cluster:
    id:     6b1b5117-6e08-4843-93d6-2da3cf8a6bae
    health: HEALTH_WARN
            1 MDSs report slow metadata IOs
            noscrub,nodeep-scrub flag(s) set
            Reduced data availability: 1 pg inactive, 1 pg down
            1 subtrees have overcommitted pool target_size_bytes
            1 subtrees have overcommitted pool target_size_ratio
            18 slow requests are blocked > 32 sec
            mons ld5505,ld5506 are low on available space

  services:
    mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 2h)
    mgr: ld5507(active, since 28h), standbys: ld5506, ld5505
    mds: cephfs:1 {0=ld4465=up:active} 1 up:standby
    osd: 441 osds: 438 up, 438 in
         flags noscrub,nodeep-scrub

  data:
    pools:   6 pools, 8432 pgs
    objects: 63.28M objects, 241 TiB
    usage:   723 TiB used, 796 TiB / 1.5 PiB avail
    pgs:     0.012% pgs not active
             8431 active+clean
             1    creating+down

  io:
    client:   33 MiB/s rd, 14.20k op/s rd, 0 op/s wr


On 15.11.2019 13:24, Wido den Hollander wrote:
>
> On 11/15/19 11:22 AM, Thomas Schneider wrote:
>> Hi,
>> ceph health is reporting: pg 59.1c is creating+down, acting [426,438]
>>
>> root@ld3955:~# ceph health detail
>> HEALTH_WARN 1 MDSs report slow metadata IOs; noscrub,nodeep-scrub
>> flag(s) set; Reduced data availability: 1 pg inactive, 1 pg down; 1
>> subtrees have overcommitted pool target_size_bytes; 1 subtrees have
>> overcommitted pool target_size_ratio; mons ld5505,ld5506 are low on
>> available space
>> MDS_SLOW_METADATA_IO 1 MDSs report slow metadata IOs
>>     mdsld4465(mds.0): 8 slow metadata IOs are blocked > 30 secs, oldest
>> blocked for 120721 secs
>> OSDMAP_FLAGS noscrub,nodeep-scrub flag(s) set
>> PG_AVAILABILITY Reduced data availability: 1 pg inactive, 1 pg down
>>     pg 59.1c is creating+down, acting [426,438]
>> MON_DISK_LOW mons ld5505,ld5506 are low on available space
>>     mon.ld5505 has 22% avail
>>     mon.ld5506 has 29% avail
>>
>> root@ld3955:~# ceph pg dump_stuck inactive
>> ok
>> PG_STAT STATE         UP        UP_PRIMARY ACTING    ACTING_PRIMARY
>> 59.1c   creating+down [426,438] 426        [426,438] 426
>>
>> How can I fix this?
> Did you change anything on the cluster?
>
> Can you share this output:
>
> $ ceph status
>
> It seems that more things are wrong with this system. This doesn't
> happen out of the blue. Something must have happened.
>
> Wido
>
>> THX
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
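
[Editor's note: a minimal sketch of follow-up diagnostics for a PG stuck in
creating+down, assuming the IDs shown in the thread above (pg 59.1c, OSDs 426
and 438). These are standard Ceph CLI commands, not output or advice from the
thread itself.]

    # Full peering state of the stuck PG; the "recovery_state" and
    # "blocked_by" sections usually show what it is waiting for:
    ceph pg 59.1c query

    # Current up/acting mapping of the PG:
    ceph pg map 59.1c

    # Settings of the pool the PG belongs to (size, min_size, crush_rule):
    ceph osd pool ls detail

    # Status and CRUSH location of the two acting OSDs:
    ceph osd find 426
    ceph osd find 438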