Hi again,
Sorry, I realize that I didn't include the output of some useful ceph commands.
# ceph status
  cluster:
    id:     838506b7-e0c6-4022-9e17-2d1cf9458be6
    health: HEALTH_ERR
            1 filesystem is degraded
            3 full osd(s)
            1 pool(s) full
            1 daemons have recently crashed

  services:
    mon: 3 daemons, quorum inf-ceph-mon0,inf-ceph-mon1,inf-ceph-mon2 (age 7w)
    mgr: inf-ceph-mon2(active, since 9w), standbys: inf-ceph-mon1, inf-ceph-mon0
    mds: cephfs_home:2/2 {0=inf-ceph-mon2=up:replay,1=inf-ceph-mon1=up:replay} 1 up:standby
    osd: 126 osds: 126 up (since 5m), 126 in (since 5M)

  task status:
    scrub status:
        mds.inf-ceph-mon1: idle
        mds.inf-ceph-mon2: idle

  data:
    pools:   3 pools, 1664 pgs
    objects: 29.90M objects, 31 TiB
    usage:   104 TiB used, 105 TiB / 210 TiB avail
    pgs:     1662 active+clean
             2    active+clean+scrubbing+deep

  io:
    client: 251 MiB/s rd, 4.8 MiB/s wr, 100 op/s rd, 160 op/s wr
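Both active MDS ranks are stuck in up:replay (presumably because they can no longer write to the full metadata pool). If it helps, I can also post the per-daemon replay state; I would gather it roughly like this (daemon names as in the status above):
# ceph fs status cephfs_home
# ceph daemon mds.inf-ceph-mon2 status      (run locally on inf-ceph-mon2)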
# ceph health detail
HEALTH_ERR 1 filesystem is degraded; 3 full osd(s); 1 pool(s) full; 1 daemons have recently crashed
FS_DEGRADED 1 filesystem is degraded
    fs cephfs_home is degraded
OSD_FULL 3 full osd(s)
    osd.120 is full
    osd.121 is full
    osd.122 is full
POOL_FULL 1 pool(s) full
    pool 'cephfs_metadata_home' is full (no space)
RECENT_CRASH 1 daemons have recently crashed
    mds.inf-ceph-mon2 crashed on host inf-ceph-mon2 at 2021-06-01 08:18:33.503311Z
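If temporarily raising the OSD full ratio is an acceptable way to let the MDS daemons finish replay before cleaning up, I assume it would be something like the following (the 0.97 value is just my guess; I have not run it yet and would rather have confirmation first):
# ceph osd dump | grep full_ratio
# ceph osd set-full-ratio 0.97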
Thanks a lot if you have any suggestions for solving this...
Hervé
On 01/06/2021 at 12:24, Hervé Ballans wrote:
Hi all,
Ceph Nautilus 14.2.16.
We have been facing a strange and critical problem since this morning.
Our CephFS metadata pool suddenly grew from 2.7% to 100% (in less than 5 hours), while there was no significant activity on the data OSDs!
Here are some numbers:
# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       205 TiB     103 TiB     102 TiB     102 TiB          49.68
    nvme      4.4 TiB     2.2 TiB     2.1 TiB     2.2 TiB          49.63
    TOTAL     210 TiB     105 TiB     104 TiB     104 TiB          49.68

POOLS:
    POOL                     ID     PGS      STORED       OBJECTS     USED        %USED      MAX AVAIL
    cephfs_data_home          7     512      11 TiB       22.58M      11 TiB       18.31        17 TiB
    cephfs_metadata_home      8     128      724 GiB       2.32M      724 GiB     100.00           0 B
    rbd_backup_vms            9     1024     19 TiB        5.00M      19 TiB       37.08        11 TiB
The cephfs_data pool uses less than half of the storage space, and there was no significant increase during the period (or before) when the metadata pool became full.
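If my arithmetic is right, and assuming the metadata pool is 3x replicated and mapped only to the nvme device class (which I still need to double-check), the 724 GiB stored corresponds to roughly 2.1 TiB of raw space, which matches the nvme usage above, so the metadata pool alone seems to have filled the nvme OSDs. I would verify those assumptions with:
# ceph osd pool get cephfs_metadata_home size
# ceph osd pool get cephfs_metadata_home crush_rule
# ceph osd df tree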
Has anyone encountered this before?
Currently, I have no idea how to solve this problem. Restarting the associated OSD and MDS services has not helped.
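For reference, by restarting I mean the systemd units, along these lines (unit names may vary with the deployment):
# systemctl restart ceph-osd@120      (on the host holding osd.120, likewise for 121 and 122)
# systemctl restart ceph-mds@inf-ceph-mon2      (on inf-ceph-mon2)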
Let me know if you want more information or logs.
Thank you for your help.
Regards,
Hervé
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx