Re: [Help appreciated] ceph mds damaged

Hello Justin,

On Tue, May 23, 2023 at 4:55 PM Justin Li <justin.li@xxxxxxxxxxxxx> wrote:
>
> Dear All,
>
> After an unsuccessful upgrade to Pacific, the MDS daemons went offline and could not come back up. I checked the MDS log and found the entries below; cluster info is included below as well. I'd appreciate it if anyone can point me in the right direction. Thanks.
>
>
> MDS log:
>
> 2023-05-24T06:21:36.831+1000 7efe56e7d700  1 mds.0.cache.den(0x600 1005480d3b2) loaded already corrupt dentry: [dentry #0x100/stray0/1005480d3b2 [19ce,head] rep@0,-2.0 NULL (dversion lock) pv=0 v=2154265030 ino=(nil) state=0 0x556433addb80]
>
>     -5> 2023-05-24T06:21:36.831+1000 7efe56e7d700 -1 mds.0.damage notify_dentry Damage to dentries in fragment * of ino 0x600is fatal because it is a system directory for this rank
>
>     -4> 2023-05-24T06:21:36.831+1000 7efe56e7d700  5 mds.beacon.posco set_want_state: up:active -> down:damaged
>
>     -3> 2023-05-24T06:21:36.831+1000 7efe56e7d700  5 mds.beacon.posco Sending beacon down:damaged seq 5339
>
>     -2> 2023-05-24T06:21:36.831+1000 7efe56e7d700 10 monclient: _send_mon_message to mon.ceph-3 at v2:10.120.0.146:3300/0
>
>     -1> 2023-05-24T06:21:37.659+1000 7efe60690700  5 mds.beacon.posco received beacon reply down:damaged seq 5339 rtt 0.827966
>
>      0> 2023-05-24T06:21:37.659+1000 7efe56e7d700  1 mds.posco respawn!
>
>
> Cluster info:
> root@ceph-1:~# ceph -s
>   cluster:
>     id:     e2b93a76-2f97-4b34-8670-727d6ac72a64
>     health: HEALTH_ERR
>             1 filesystem is degraded
>             1 filesystem is offline
>             1 mds daemon damaged
>
>   services:
>     mon: 3 daemons, quorum ceph-1,ceph-2,ceph-3 (age 26h)
>     mgr: ceph-3(active, since 15h), standbys: ceph-1, ceph-2
>     mds: 0/1 daemons up, 3 standby
>     osd: 135 osds: 133 up (since 10h), 133 in (since 2w)
>
>   data:
>     volumes: 0/1 healthy, 1 recovering; 1 damaged
>     pools:   4 pools, 4161 pgs
>     objects: 230.30M objects, 276 TiB
>     usage:   836 TiB used, 460 TiB / 1.3 PiB avail
>     pgs:     4138 active+clean
>              13   active+clean+scrubbing
>              10   active+clean+scrubbing+deep
>
>
>
> root@ceph-1:~# ceph health detail
> HEALTH_ERR 1 filesystem is degraded; 1 filesystem is offline; 1 mds daemon damaged
> [WRN] FS_DEGRADED: 1 filesystem is degraded
>     fs cephfs is degraded
> [ERR] MDS_ALL_DOWN: 1 filesystem is offline
>     fs cephfs is offline because no MDS is active for it.
> [ERR] MDS_DAMAGE: 1 mds daemon damaged
>     fs cephfs mds.0 is damaged

Do you have a complete log you can share? Try:

https://docs.ceph.com/en/quincy/man/8/ceph-post-file/
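
For example, something like the following (a sketch only; the log path assumes
the default /var/log/ceph location, and the daemon name "posco" is taken from
your log excerpt, so adjust both for your setup):

  ceph-post-file -d "MDS rank 0 damaged after pacific upgrade" /var/log/ceph/ceph-mds.posco.log

ceph-post-file prints a tag (a UUID) when it finishes; paste that tag back in
this thread so the upload can be found.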

To get your upgrade to complete, you may set:

ceph config set mds mds_go_bad_corrupt_dentry false
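
After setting that, a rough follow-on sequence (only a sketch; the filesystem
name "cephfs" and rank 0 are taken from your health detail output) would be to
clear the damage flag and watch for a standby to take over:

  ceph mds repaired cephfs:0    # clear the "fs cephfs mds.0 is damaged" flag
  ceph fs status cephfs         # wait for a rank 0 MDS to reach up:active

If the MDS goes damaged again with the same dentry message, please include that
log in the upload.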

--
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx



