Re: mds crashes after up:replay state

Lars Köppel <lars.koeppel@xxxxxxxxxx> · Sat, 6 Jan 2024 13:22:14 +0100

Hi Patrick,

Thank you for your response.
I had already changed the settings mentioned there, but without success.

The journal inspection I started yesterday finished with: 'Overall
journal integrity: OK'.
So you are probably right that the mds crashes shortly after the replay
finishes.

I checked the logs, and every few seconds there is a new FSMap epoch without
any visible changes. One of the recent epochs is included at the end of this
mail. Is there anything useful in it?
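
One way to check whether two consecutive epochs really are identical is to
diff their dumps line by line. The following is only a minimal sketch in
Python, assuming the dumps were saved as plain text (for example from
'ceph fs dump'). The function name fsmap_changes and the shortened sample
dumps are made up for illustration; only the epoch 205510 and seq 4484
values come from the dump at the end of this mail.

```python
import difflib

def fsmap_changes(old_dump, new_dump):
    """Return the lines that differ between two FSMap text dumps.

    Removed lines are prefixed with '-', added lines with '+'.
    """
    diff = list(difflib.unified_diff(
        old_dump.splitlines(), new_dump.splitlines(), lineterm="", n=0))
    # The first two entries are the '---' / '+++' file headers; skip them
    # and drop the '@@' hunk markers, keeping only real changes.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

# Hypothetical, shortened dumps: the older one is invented for the example.
OLD = "epoch   205509\nmax_mds 1\nstate up:replay seq 4483"
NEW = "epoch   205510\nmax_mds 1\nstate up:replay seq 4484"

for line in fsmap_changes(OLD, NEW):
    print(line)
```

If only the epoch and the seq counter show up in such a diff, the new epochs
carry no state change, which would match the "no visible changes" impression.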

When the replay is finished, the running mds switches to the state
'up:reconnect' and about a second later to the state 'up:rejoin'. After that
no new fsmap appears for ~20 min, until this message pops up:

> Jan 06 12:38:23 storage01 ceph-mds[223997]:
> mds.beacon.cephfs.storage01.pgperp Skipping beacon heartbeat to monitors
> (last acked 4.00012s ago); MDS internal heartbeat is not healthy!
>
A few seconds later (the heartbeat message is still being logged) a new fsmap
is created with a new mds, now in replay state.
The last of the heartbeat messages reports 1446 seconds since the last acked
beacon. Then the messages stop and no more warnings or errors are logged. One
minute after the last message the mds is back as a standby mds.

> Jan 06 13:02:26 storage01 ceph-mds[223997]:
> mds.beacon.cephfs.storage01.pgperp Skipping beacon heartbeat to monitors
> (last acked 1446.6s ago); MDS internal heartbeat is not healthy!
>
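
To follow how long the heartbeat stays unhealthy, the 'last acked' ages can be
extracted from the log. This is a rough sketch in Python, assuming each
message sits on a single line in the format quoted above; the function name
last_acked_seconds is hypothetical.

```python
import re

# Matches e.g. "(last acked 4.00012s ago)" in the beacon messages quoted above.
LAST_ACKED = re.compile(r"last acked (\d+(?:\.\d+)?)s ago")

def last_acked_seconds(lines):
    """Return the 'last acked' age in seconds for every matching log line."""
    ages = []
    for line in lines:
        match = LAST_ACKED.search(line)
        if match:
            ages.append(float(match.group(1)))
    return ages

# The two messages quoted in this mail, each joined onto one line.
sample = [
    "Jan 06 12:38:23 storage01 ceph-mds[223997]: "
    "mds.beacon.cephfs.storage01.pgperp Skipping beacon heartbeat to monitors "
    "(last acked 4.00012s ago); MDS internal heartbeat is not healthy!",
    "Jan 06 13:02:26 storage01 ceph-mds[223997]: "
    "mds.beacon.cephfs.storage01.pgperp Skipping beacon heartbeat to monitors "
    "(last acked 1446.6s ago); MDS internal heartbeat is not healthy!",
]

ages = last_acked_seconds(sample)
print(f"first: {ages[0]}s, last: {ages[-1]}s")
```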

Also, I cannot find any warning in the logs at the time the mds crashes. What
could I do to find the cause of the crash?

Best regards
Lars

e205510
> enable_multiple, ever_enabled_multiple: 1,1
> default compat: compat={},rocompat={},incompat={1=base v0.20,2=client
> writeable ranges,3=default file layouts on dirs,4=dir inode in separate
> object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
> anchor table,9=file layout v2,10=snaprealm v2}
> legacy client fscid: 3
>
> Filesystem 'cephfs' (3)
> fs_name cephfs
> epoch   205510
> flags   32 joinable allow_snaps allow_multimds_snaps allow_standby_replay
> created 2023-06-06T11:44:03.651905+0000
> modified        2024-01-06T10:28:14.676738+0000
> tableserver     0
> root    0
> session_timeout 60
> session_autoclose       300
> max_file_size   8796093022208
> required_client_features        {}
> last_failure    0
> last_failure_osd_epoch  42962
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
> uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline
> data,8=no anchor table,9=file layout v2,10=snaprealm v2}
> max_mds 1
> in      0
> up      {0=2178448}
> failed
> damaged
> stopped
> data_pools      [11,12]
> metadata_pool   10
> inline_data     disabled
> balancer
> standby_count_wanted    1
> [mds.cephfs.storage01.pgperp{0:2178448} state up:replay seq 4484
> join_fscid=3 addr [v2:
> 192.168.0.101:6800/855849996,v1:192.168.0.101:6801/855849996] compat
> {c=[1],r=[1],i=[7ff]}]
>
>
> Filesystem 'cephfs_recovery' (4)
> fs_name cephfs_recovery
> epoch   193460
> flags   13 allow_snaps allow_multimds_snaps
> created 2024-01-05T10:47:32.224388+0000
> modified        2024-01-05T16:43:37.677241+0000
> tableserver     0
> root    0
> session_timeout 60
> session_autoclose       300
> max_file_size   1099511627776
> required_client_features        {}
> last_failure    0
> last_failure_osd_epoch  42904
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
> uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline
> data,8=no anchor table,9=file layout v2,10=snaprealm v2}
> max_mds 1
> in      0
> up      {}
> failed
> damaged 0
> stopped
> data_pools      [11,12]
> metadata_pool   13
> inline_data     disabled
> balancer
> standby_count_wanted    1
>
>
> Standby daemons:
>
> [mds.cephfs.storage02.zopcif{-1:2356728} state up:standby seq 1
> join_fscid=3 addr [v2:
> 192.168.0.102:6800/3567764205,v1:192.168.0.102:6801/3567764205] compat
> {c=[1],r=[1],i=[7ff]}]
> dumped fsmap epoch 205510
>

Lars Köppel
Developer
Email: lars.koeppel@xxxxxxxxxx
Phone: +49 6221 5993580
ariadne.ai (Germany) GmbH
Häusserstraße 3, 69115 Heidelberg
Amtsgericht Mannheim, HRB 744040
Geschäftsführer: Dr. Fabian Svara
https://ariadne.ai

On Fri, Jan 5, 2024 at 7:52 PM Patrick Donnelly <pdonnell@xxxxxxxxxx> wrote:

> Hi Lars,
>
> On Fri, Jan 5, 2024 at 9:53 AM Lars Köppel <lars.koeppel@xxxxxxxxxx>
> wrote:
> >
> > Hello everyone,
> >
> > we are running a small cluster with 3 nodes and 25 osds per node. And Ceph
> > version 17.2.6.
> > Recently the active mds crashed and since then the new starting mds has
> > always been in the up:replay state. In the output of the command 'ceph tell
> > mds.cephfs:0 status' you can see that the journal is completely read in. As
> > soon as it's finished, the mds crashes and the next one starts reading the
> > journal.
> >
> > At the moment I have the journal inspection running ('cephfs-journal-tool
> > --rank=cephfs:0 journal inspect').
> >
> > Does anyone have any further suggestions on how I can get the cluster
> > running again as quickly as possible?
>
> Please review:
>
> https://docs.ceph.com/en/reef/cephfs/troubleshooting/#stuck-during-recovery
>
> Note: your MDS is probably not failing in up:replay but shortly after
> reaching one of the later states. Check the mon logs to see what the
> FSMap changes were.
>
>
> Patrick Donnelly, Ph.D.
> He / Him / His
> Red Hat Partner Engineer
> IBM, Inc.
> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx