MDSs report damaged metadata

Good morning to everybody,

I ran into a problem where inodes are not updated in the journal backlog, and scrubbing plus repair does not remove the stale information.

Information about my Ceph installation (a quick way to verify this layout is sketched below the list):

 * version Pacific 16.2.5
 * 6 nodes
 * 48 OSDs
 * one active MDS
 * one standby-replay MDS
 * one standby MDS
 * one CephFS Pool on spindles for data - cephfs_data - usage 6%
 * one CephFS pool on nvme for metadata - cephfs_meta - usage <1%
 * mounted via kernel client:
     o Ubuntu, kernel 5.11.22
     o CentOS 7, kernel 3.10.0-1160.42.2.el7
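
For completeness, this is roughly how I check that layout and the current health state on the cluster:

ceph fs status scfs     # active / standby-replay / standby MDS and both pools
ceph health detail      # shows the MDS_DAMAGE warning ("MDSs report damaged metadata")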


We are using CephFS to hold the state files of our Slurm queuing system, to keep the masters in sync. These files somehow lead to backtrace errors which are not fixed by an ordinary scrub with repair; I am forced to use the following command:

ceph tell mds.scfs:0 scrub start / recursive repair force
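
Afterwards I check whether the scrub has finished and whether the entries are gone, roughly like this:

ceph tell mds.scfs:0 scrub status    # progress of the running scrub on rank 0
ceph tell mds.scfs:0 damage ls       # remaining damage entries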

These clients are on CentOS 7 and mount CephFS via the kernel client. I ran into this problem while failing over the active MDS to the standby-replay daemon. I also noticed that if I do another failover now, the error disappears for a few hours, until these files are written again.
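
The failover itself is done by failing the active rank so that the standby-replay daemon takes over, something like:

ceph mds fail scfs:0    # mark rank 0 failed; the standby-replay MDS becomes active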

If I run "damage ls", these files are listed:

root@scvirt01:/home/urzadmin# ceph tell mds.0 damage ls
2021-10-11T09:27:52.286+0200 7fa8117fa700  0 client.395589209 ms_handle_reset on v2:172.26.8.153:6800/3237390256
2021-10-11T09:27:52.306+0200 7fa8117fa700  0 client.395589215 ms_handle_reset on v2:172.26.8.153:6800/3237390256
[
    {
        "damage_type": "backtrace",
        "id": 389005317,
        "ino": 1099539016806,
        "path": "/slurmstate_galaxy/state/qos_usage.old"
    },
    {
        "damage_type": "backtrace",
        "id": 784942402,
        "ino": 1099539034091,
        "path": "/slurmstate_galaxy/state/trigger_state.old"
    },
    {
        "damage_type": "backtrace",
        "id": 800422439,
        "ino": 1099539016280,
        "path": "/slurmstate_galaxy/state/priority_last_decay_ran.old"
    },
    {
        "damage_type": "backtrace",
        "id": 1096079557,
        "ino": 1099539034095,
        "path": "/slurmstate_galaxy/state/fed_mgr_state.old"
    },
    {
        "damage_type": "backtrace",
        "id": 1478025581,
        "ino": 1099539034678,
        "path": "/slurmstate_galaxy/state/heartbeat"
    },
    {
        "damage_type": "backtrace",
        "id": 1850571320,
        "ino": 1099539034090,
        "path": "/slurmstate_galaxy/state/resv_state.old"
    },
    {
        "damage_type": "backtrace",
        "id": 2374363174,
        "ino": 1099539016807,
        "path": "/slurmstate_galaxy/state/fed_mgr_state.old"
    },
    {
        "damage_type": "backtrace",
        "id": 2476062375,
        "ino": 1099539034092,
        "path": "/slurmstate_galaxy/state/assoc_mgr_state.old"
    },
    {
        "damage_type": "backtrace",
        "id": 2615211078,
        "ino": 1099539034088,
        "path": "/slurmstate_galaxy/state/node_state.old"
    },
    {
        "damage_type": "backtrace",
        "id": 2872809546,
        "ino": 1099539016538,
        "path": "/slurmstate_galaxy/state/priority_last_decay_ran.old"
    },
    {
        "damage_type": "backtrace",
        "id": 2952984622,
        "ino": 1099539034094,
        "path": "/slurmstate_galaxy/state/qos_usage.old"
    },
    {
        "damage_type": "backtrace",
        "id": 3048617909,
        "ino": 1099539017073,
        "path": "/slurmstate_galaxy/state/trigger_state.old"
    },
    {
        "damage_type": "backtrace",
        "id": 4027167458,
        "ino": 1099539035485,
        "path": "/slurmstate_galaxy/state/heartbeat"
    },
    {
        "damage_type": "backtrace",
        "id": 4094349452,
        "ino": 1099539034093,
        "path": "/slurmstate_galaxy/state/assoc_usage.old"
    },
    {
        "damage_type": "backtrace",
        "id": 4274997805,
        "ino": 1099539034089,
        "path": "/slurmstate_galaxy/state/part_state.old"
    }
]
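
As far as I understand it, the backtrace that the scrub complains about is stored as the "parent" xattr on the first data object of each file, so for one of the damaged inodes it can be inspected directly from the data pool. A rough sketch, assuming the usual <inode-in-hex>.00000000 object naming:

printf '%x\n' 1099539016806                                              # -> 10001a1ec66
rados -p cephfs_data getxattr 10001a1ec66.00000000 parent > parent.bin   # raw backtrace blob
ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json  # decode it for reading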

When I do an ls on the client node, the inode numbers and time stamps all look consistent and correct (a check against the damaged inode numbers is sketched after the listing):

[root@galaxymaster01 state]# ls -ila
total 692
1099511628781 drwxr-xr-x 1 svcslurm root             42 Oct 11 09:49 .
1099511627776 drwxr-xr-x 1 root     root              4 Apr  1 2021 ..
1099539038993 -rw------- 1 svcslurm domain users  48483 Oct 11 09:45 assoc_mgr_state
1099539038962 -rw------- 1 svcslurm domain users  48483 Oct 11 09:40 assoc_mgr_state.old
1099539038996 -rw------- 1 svcslurm domain users  14366 Oct 11 09:45 assoc_usage
1099539038963 -rw------- 1 svcslurm domain users  14366 Oct 11 09:40 assoc_usage.old
1099521678639 -rw-r--r-- 1 svcslurm domain users      7 Apr  1 2021 clustername
1099511628783 -rw-r--r-- 1 svcslurm domain users      7 Mar 26 2018 clustername_bkp
2199023255934 -rw-r--r-- 1 svcslurm domain users      7 Apr  1 2021 clustername_bkp2
1099536954641 -rw-r--r-- 1 svcslurm domain users      0 Apr  1 2021 clustername_bkp3
1099538397169 -rw------- 1 svcslurm domain users    211 Sep 11 14:36 dbd.messages
1099539039000 -rw------- 1 svcslurm domain users     19 Oct 11 09:45 fed_mgr_state
1099539038967 -rw------- 1 svcslurm domain users     19 Oct 11 09:40 fed_mgr_state.old
1099511628789 drwxr----- 1 svcslurm domain users     11 Oct 11 09:34 hash.0
1099511631702 drwxr----- 1 svcslurm domain users      9 Oct 11 09:48 hash.1
1099511634616 drwxr----- 1 svcslurm domain users      9 Oct 11 09:39 hash.2
1099511637533 drwxr----- 1 svcslurm domain users     14 Oct 11 09:39 hash.3
1099511640454 drwxr----- 1 svcslurm domain users      9 Oct 11 09:39 hash.4
1099511643373 drwxr----- 1 svcslurm domain users      8 Oct 11 09:44 hash.5
1099511646293 drwxr----- 1 svcslurm domain users     14 Oct 11 09:44 hash.6
1099511649213 drwxr----- 1 svcslurm domain users     15 Oct 11 09:45 hash.7
1099511652131 drwxr----- 1 svcslurm domain users     11 Oct 11 09:45 hash.8
1099511655048 drwxr----- 1 svcslurm domain users     13 Oct 11 09:30 hash.9
1099539039021 -rw------- 1 svcslurm domain users     16 Oct 11 09:49 heartbeat
1099539039011 -rw------- 1 svcslurm domain users 264068 Oct 11 09:46 job_state
1099539039006 -rw------- 1 svcslurm domain users 260650 Oct 11 09:45 job_state.old
1099538757056 -rw------- 1 svcslurm domain users     42 Sep 15 08:47 last_config_lite
1099538397170 -rw------- 1 svcslurm domain users     42 Sep 11 14:44 last_config_lite.old
1099539038991 -rw------- 1 svcslurm domain users    451 Oct 11 09:45 last_tres
1099539038959 -rw------- 1 svcslurm domain users    451 Oct 11 09:40 last_tres.old
1099521425633 -rw------- 1 svcslurm domain users   2874 Jun  8 2020 layouts_state_base
1099521425597 -rw------- 1 svcslurm domain users   2874 Jun  8 2020 layouts_state_base.old
1099539039012 -rw------- 1 svcslurm domain users  17925 Oct 11 09:46 node_state
1099539039007 -rw------- 1 svcslurm domain users  17925 Oct 11 09:45 node_state.old
1099539038995 -rw------- 1 svcslurm domain users   1018 Oct 11 09:45 part_state
1099539038964 -rw------- 1 svcslurm domain users   1018 Oct 11 09:40 part_state.old
1099539039016 -rw------- 1 svcslurm domain users     16 Oct 11 09:47 priority_last_decay_ran
1099539038974 -rw------- 1 svcslurm domain users     16 Oct 11 09:42 priority_last_decay_ran.old
1099539038998 -rw------- 1 svcslurm domain users    796 Oct 11 09:45 qos_usage
1099539038965 -rw------- 1 svcslurm domain users    796 Oct 11 09:40 qos_usage.old
1099539038997 -rw------- 1 svcslurm domain users     35 Oct 11 09:45 resv_state
1099539038966 -rw------- 1 svcslurm domain users     35 Oct 11 09:40 resv_state.old
1099539038999 -rw------- 1 svcslurm domain users     31 Oct 11 09:45 trigger_state
1099539038968 -rw------- 1 svcslurm domain users     31 Oct 11 09:40 trigger_state.old
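
To double-check whether one of the damaged inode numbers still resolves to a path at all, the client can be asked directly (the mount point below is a placeholder for our real one; the inode number is taken from the damage list above):

find /mnt/cephfs -inum 1099539016806    # prints nothing if the inode no longer exists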


So, my first question: is it safe to remove the damage entries? Via:

ceph tell mds.$filesystem:0 damage rm $id
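
If it is safe, I would clear all of them in one go, roughly like this (assuming jq is available on the admin node):

ceph tell mds.scfs:0 damage ls | jq -r '.[].id' | while read -r id; do
    ceph tell mds.scfs:0 damage rm "$id"    # remove each listed damage entry on rank 0
done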

My second question: can I do something to avoid running into this error again? Perhaps switch to the FUSE client?
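
For the FUSE variant the mount on the CentOS 7 clients would look roughly like this (client name, keyring path and mount point are placeholders):

ceph-fuse -n client.slurm --keyring /etc/ceph/ceph.client.slurm.keyring /mnt/cephfs    # FUSE mount instead of the kernel client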


Thanks in advance!

Cheers,

Vadim


--
Vadim Bulst

Universität Leipzig / URZ
04109  Leipzig, Augustusplatz 10

phone:   +49-341-97-33380
mail:vadim.bulst@xxxxxxxxxxxxxx
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
