Hi,
what happened to the cluster? Several services report a short uptime
(68 minutes). If you share some MDS logs, someone might find a hint as
to why the daemons won't become active. If the regular logs don't
reveal anything, enable debug logs.
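In case it helps, here is a minimal sketch of how the MDS debug level could be raised and later reset through the central config. The values 20 and 1 are commonly used verbose levels, not something taken from this cluster, and the CEPH variable and the two function names are only illustrative. In a Rook setup the ceph CLI is usually run from the toolbox pod, so the override shown in the comment assumes the default namespace and deployment name.

```shell
# CEPH is a hypothetical override for how the ceph CLI is invoked.
# Assumption for Rook (default names, adjust to your cluster):
#   CEPH="kubectl -n rook-ceph exec deploy/rook-ceph-tools -- ceph"
CEPH="${CEPH:-ceph}"

# Raise MDS logging for all MDS daemons. This is very verbose,
# so only leave it on while reproducing the problem.
# $CEPH is left unquoted on purpose so a multi-word override is split.
mds_debug_on() {
  $CEPH config set mds debug_mds 20
  $CEPH config set mds debug_ms 1
}

# Drop the overrides again so the daemons fall back to their defaults.
mds_debug_off() {
  $CEPH config rm mds debug_mds
  $CEPH config rm mds debug_ms
}
```

Call mds_debug_on, restart or wait for the MDS daemons to retry, collect the logs, then call mds_debug_off.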
Quoting Tobias Florek <ceph@xxxxxxxxxx>:
Hi!
I am running a Rook-managed hyperconverged Ceph cluster on
Kubernetes using Ceph 17.2.3 with a single-rank, single-filesystem
CephFS. I am now facing the problem that the MDS daemons stay in
up:standby. I tried setting allow_standby_replay to false and
restarting both MDS daemons, but nothing changed.
ceph -s
cluster:
id: 08f51f08-9551-488f-9419-787a7717555e
health: HEALTH_ERR
1 filesystem is degraded
1 filesystem is offline
1 mds daemon damaged
services:
mon: 5 daemons, quorum cy,dt,du,dv,dw (age 68m)
mgr: a(active, since 64m), standbys: b
mds: 0/1 daemons up, 2 standby
osd: 10 osds: 10 up (since 68m), 10 in (since 3d)
data:
volumes: 0/1 healthy, 1 recovering; 1 damaged
pools: 14 pools, 273 pgs
objects: 834.69k objects, 1.2 TiB
usage: 3.7 TiB used, 23 TiB / 26 TiB avail
pgs: 273 active+clean
The journal looks ok though:
cephfs-journal-tool --rank cephfs:0 journal inspect
Overall journal integrity: OK
cephfs-journal-tool --rank cephfs:0 header get
{
"magic": "ceph fs volume v011",
"write_pos": 2344234253408,
"expire_pos": 2344068406026,
"trimmed_pos": 2344041316352,
"stream_format": 1,
"layout": {
"stripe_unit": 4194304,
"stripe_count": 1,
"object_size": 4194304,
"pool_id": 10,
"pool_ns": ""
}
}
cephfs-journal-tool --rank cephfs:0 event get summary
Events by type:
OPEN: 47779
SESSION: 24
SUBTREEMAP: 113
UPDATE: 53346
Errors: 0
ceph fs dump
e269368
enable_multiple, ever_enabled_multiple: 1,1
default compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
legacy client fscid: 1
Filesystem 'cephfs' (1)
fs_name cephfs
epoch 269356
flags 32 joinable allow_snaps allow_multimds_snaps allow_standby_replay
created 2020-05-05T21:54:21.907356+0000
modified 2022-09-07T13:32:13.263940+0000
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
required_client_features {}
last_failure 0
last_failure_osd_epoch 69305
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
max_mds 1
in 0
up {}
failed
damaged 0
stopped
data_pools [11,14]
metadata_pool 10
inline_data disabled
balancer
standby_count_wanted 1
Standby daemons:
[mds.cephfs-a{-1:94490181} state up:standby seq 1 join_fscid=1 addr [v2:172.21.0.75:6800/3162134136,v1:172.21.0.75:6801/3162134136] compat {c=[1],r=[1],i=[7ff]}]
[mds.cephfs-b{-1:94519600} state up:standby seq 1 join_fscid=1 addr [v2:172.21.0.76:6800/2282837495,v1:172.21.0.76:6801/2282837495] compat {c=[1],r=[1],i=[7ff]}]
dumped fsmap epoch 269368
Thank you for your help!
Tobias Florek
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx