Re: [Urgent] Ceph system Down, Ceph FS volume in recovering



It looks like you have quite a few problems. I'll try to address them one by one.

1) It looks like you had a bunch of crashes. From the ceph -s output, it looks like you don't have enough MDS daemons running: there is no standby available. So you'll need to restart the crashed containers.

2) It looks like you might have an interesting CRUSH map. Allegedly you have 41 TiB of space available, but you can't finish recovering: lots of PGs are stuck because their destination is too full. Are you running homogeneous hardware, or do you have different drive sizes? Are all the weights set correctly?

Once you correct item 1, you'll need to correct item 2 to get back to a healthy spot.
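
For item 2, here is a rough sketch of the kind of check I mean, not a definitive tool. It assumes you have the per-OSD entries from `ceph osd df --format json` and that they carry the fields `name`, `kb`, `crush_weight` and `utilization`; verify those names against your release. The function name, the 10% tolerance and the sample data are made up for illustration. The 0.85 threshold is Ceph's default nearfull ratio, and the weight check relies on the convention that a CRUSH weight defaults to the device size in TiB.

```python
# Sketch only: field names assume the JSON shape of `ceph osd df --format json`.
NEARFULL_RATIO = 0.85  # Ceph's default nearfull threshold


def check_osds(nodes, weight_tolerance=0.10):
    """Return (nearfull, misweighted) lists of OSD names."""
    nearfull, misweighted = [], []
    for osd in nodes:
        # utilization is reported as a percentage
        if osd["utilization"] / 100.0 >= NEARFULL_RATIO:
            nearfull.append(osd["name"])
        size_tib = osd["kb"] / 2**30  # KiB -> TiB
        # Flag OSDs whose CRUSH weight differs from their size in TiB
        # by more than the (hypothetical) tolerance.
        if size_tib > 0 and abs(osd["crush_weight"] - size_tib) / size_tib > weight_tolerance:
            misweighted.append(osd["name"])
    return nearfull, misweighted


# Hypothetical sample data, not taken from the cluster in this thread.
sample = [
    {"name": "osd.0", "kb": 4 * 2**30, "crush_weight": 4.0, "utilization": 62.0},
    {"name": "osd.1", "kb": 2 * 2**30, "crush_weight": 4.0, "utilization": 88.5},
    {"name": "osd.2", "kb": 4 * 2**30, "crush_weight": 3.9, "utilization": 60.1},
]

nearfull, misweighted = check_osds(sample)
print("nearfull:", nearfull)
print("weight does not match size:", misweighted)

# Cluster-wide figures from the ceph -s output in this thread.
used_tib, total_tib = 75, 116
print(f"overall usage: {used_tib / total_tib:.1%}")
```

In the made-up sample, osd.1 is a 2 TiB drive weighted as if it were 4 TiB, so it fills up well before its neighbours. Your cluster is only about 65% full overall (75 TiB of 116 TiB) yet already has a nearfull OSD, which is why I suspect uneven weights or mixed drive sizes.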

Sent from Bloomberg Professional for iPhone

----- Original Message -----
From: nguyenvandiep@xxxxxxxxxxxxxx
To: ceph-users@xxxxxxx
At: 02/24/24 09:01:22 UTC

Hi Mathew

Please check my ceph -s

ceph -s
    id:     258af72a-cff3-11eb-a261-d4f5ef25154c
    health: HEALTH_WARN
            3 failed cephadm daemon(s)
            1 filesystem is degraded
            insufficient standby MDS daemons available
            1 nearfull osd(s)
            Low space hindering backfill (add storage if this doesn't resolve itself): 21 pgs backfill_toofull
            15 pool(s) nearfull
            11 daemons have recently crashed

    mon:         6 daemons, quorum cephgw03,cephosd01,cephgw01,cephosd03,cephgw02,cephosd02 (age 30h)
    mgr:         cephgw01.vwoffq(active, since 17h), standbys: cephgw02.nauphz,
    mds:         1/1 daemons up
    osd:         29 osds: 29 up (since 40h), 29 in (since 29h); 402 remapped pgs
    rgw:         2 daemons active (2 hosts, 1 zones)
    tcmu-runner: 18 daemons active (2 hosts)

    volumes: 0/1 healthy, 1 recovering
    pools:   15 pools, 1457 pgs
    objects: 36.87M objects, 25 TiB
    usage:   75 TiB used, 41 TiB / 116 TiB avail
    pgs:     17759672/110607480 objects misplaced (16.056%)
             1055 active+clean
             363  active+remapped+backfill_wait
             18   active+remapped+backfilling
             14   active+remapped+backfill_toofull
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
