Hi Everyone,
I have a down system whose MDS is stuck in the up:rejoin state. When I
run ceph-mds with -d and --debug_mds 10, I get this repeating:
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dir frag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dir frag is stored in omap,8=no anchor table,9=file layout v2,10=snaprealm v2}
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 my gid is 161986332
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 map says I am mds.0.2365745 state up:rejoin
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 msgr says i am [v2:172.23.0.44:6800/4094836140,v1:172.23.0.44:6801/4094836140]
2022-05-31 00:33:03.554 7fac80ee3700 10 mds.trex-ceph4 handle_mds_map: handling map as rank 0
2022-05-31 00:33:03.557 7fac83972700 5 mds.beacon.trex-ceph4 received beacon reply up:rejoin seq 31 rtt 0.21701
2022-05-31 00:33:04.185 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:05.182 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:05.182 7fac7c6da700 10 mds.0.cache releasing free memory
2022-05-31 00:33:06.182 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:07.183 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:07.341 7fac7dedd700 5 mds.beacon.trex-ceph4 Sending beacon up:rejoin seq 32
2022-05-31 00:33:07.341 7fac83972700 5 mds.beacon.trex-ceph4 received beacon reply up:rejoin seq 32 rtt 0
2022-05-31 00:33:08.183 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:09.184 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:10.184 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:11.185 7fac7c6da700 10 mds.0.cache cache not ready for trimming
2022-05-31 00:33:11.341 7fac7dedd700 5 mds.beacon.trex-ceph4 Sending beacon up:rejoin seq 33
2022-05-31 00:33:11.397 7fac80ee3700 1 mds.trex-ceph4 Updating MDS map to version 2365758 from mon.0
It just stays in that state, seemingly forever, and the process appears
to be using essentially no CPU. I don't even know where to look at this point.
I see this in the mon log:
2022-05-31 00:36:27.359 7f39d0c6c700 1 mon.trex-ceph1@0(leader).osd e51026 _set_new_cache_sizes cache_size:1020054731 inc_alloc: 301989888 full_alloc: 322961408 kv_alloc: 390070272
I'm falling asleep at the keyboard trying to get this to work. Any thoughts?
Thanks
-Dave