Can't upgrade to MDS version 12.2.8

Hello there,

I've tried setting up some MDS VMs with version 12.2.8, but they are
unable to replay, which appears to be caused by an error on the monitors:

2018-09-01 18:52:39.101001 7fb7c4c4f700  1 mon.mon2@1(peon).mds e3320
mds mds.? 10.14.4.62:6800/605610442 can't write to fsmap
compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
anchor table,9=file layout v2}

Here are the debug logs of the MDS:

2018-09-02 00:16:36.530702 7f2df33d3200  0 ceph version 12.2.7
(3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5) luminous (stable), process
ceph-mds, pid 10303
2018-09-02 00:16:36.533294 7f2df33d3200  0 pidfile_write: ignore empty
--pid-file
2018-09-02 00:16:36.534982 7f2df33d3200 10 mds.ceph-mds1 112 MDSCacheObject
2018-09-02 00:16:36.534983 7f2df33d3200 10 mds.ceph-mds1 1776 CInode
2018-09-02 00:16:36.534984 7f2df33d3200 10 mds.ceph-mds1 16
elist<>::item   *7=112
2018-09-02 00:16:36.534985 7f2df33d3200 10 mds.ceph-mds1 536 inode
2018-09-02 00:16:36.534986 7f2df33d3200 10 mds.ceph-mds1 552 old_inode
2018-09-02 00:16:36.534988 7f2df33d3200 10 mds.ceph-mds1 48
nest_info_t
2018-09-02 00:16:36.534988 7f2df33d3200 10 mds.ceph-mds1 40
frag_info_t
2018-09-02 00:16:36.534989 7f2df33d3200 10 mds.ceph-mds1 40 SimpleLock
*5=200
2018-09-02 00:16:36.534990 7f2df33d3200 10 mds.ceph-mds1 48 ScatterLock
*3=144
2018-09-02 00:16:36.534991 7f2df33d3200 10 mds.ceph-mds1 512 CDentry
2018-09-02 00:16:36.534992 7f2df33d3200 10 mds.ceph-mds1 16 elist<>::item
2018-09-02 00:16:36.534993 7f2df33d3200 10 mds.ceph-mds1 40 SimpleLock
2018-09-02 00:16:36.534993 7f2df33d3200 10 mds.ceph-mds1 1608 CDir
2018-09-02 00:16:36.534994 7f2df33d3200 10 mds.ceph-mds1 16
elist<>::item   *2=32
2018-09-02 00:16:36.534995 7f2df33d3200 10 mds.ceph-mds1 232 fnode_t
2018-09-02 00:16:36.534996 7f2df33d3200 10 mds.ceph-mds1 48
nest_info_t *2
2018-09-02 00:16:36.534997 7f2df33d3200 10 mds.ceph-mds1 40
frag_info_t *2
2018-09-02 00:16:36.534998 7f2df33d3200 10 mds.ceph-mds1 288 Capability
2018-09-02 00:16:36.534998 7f2df33d3200 10 mds.ceph-mds1 32
xlist<>::item   *2=64
2018-09-02 00:16:36.536024 7f2dee63a700 10 mds.ceph-mds1
MDSDaemon::ms_get_authorizer type=mon
2018-09-02 00:16:36.539439 7f2df33d3200 10 mds.beacon.ceph-mds1 _send
up:boot seq 1
2018-09-02 00:16:36.539628 7f2dec699700  5 mds.ceph-mds1 handle_mds_map
epoch 3603 from mon.2
2018-09-02 00:16:36.539673 7f2dec699700 10 mds.ceph-mds1      my compat
compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds
uses inline data,8=no anchor table,9=file layout v2}
2018-09-02 00:16:36.539680 7f2dec699700 10 mds.ceph-mds1  mdsmap compat
compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
anchor table,9=file layout v2}
2018-09-02 00:16:36.539684 7f2dec699700 10 mds.ceph-mds1 map says I am
10.14.4.61:6800/938934393 mds.-1.-1 state ???
2018-09-02 00:16:36.539699 7f2dec699700 10 mds.ceph-mds1 handle_mds_map:
handling map in rankless mode
2018-09-02 00:16:36.539711 7f2dec699700 10 mds.ceph-mds1 not in map yet
2018-09-02 00:16:36.543853 7f2deee3b700 10 mds.ceph-mds1
MDSDaemon::ms_get_authorizer type=mgr
2018-09-02 00:16:37.540991 7f2deee3b700 10 mds.ceph-mds1
MDSDaemon::ms_get_authorizer type=mgr
2018-09-02 00:16:38.541422 7f2deee3b700 10 mds.ceph-mds1
MDSDaemon::ms_get_authorizer type=mgr
2018-09-02 00:16:39.541559 7f2deee3b700 10 mds.ceph-mds1
MDSDaemon::ms_get_authorizer type=mgr
2018-09-02 00:16:40.539545 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:boot seq 2
2018-09-02 00:16:40.541599 7f2deee3b700 10 mds.ceph-mds1
MDSDaemon::ms_get_authorizer type=mgr
2018-09-02 00:16:40.939600 7f2dec699700  5 mds.ceph-mds1 handle_mds_map
epoch 3604 from mon.2
2018-09-02 00:16:40.939623 7f2dec699700 10 mds.ceph-mds1      my compat
compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds
uses inline data,8=no anchor table,9=file layout v2}
2018-09-02 00:16:40.939626 7f2dec699700 10 mds.ceph-mds1  mdsmap compat
compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
anchor table,9=file layout v2}
2018-09-02 00:16:40.939629 7f2dec699700 10 mds.ceph-mds1 map says I am
10.14.4.61:6800/938934393 mds.-1.0 state up:standby
2018-09-02 00:16:40.939632 7f2dec699700 10 mds.ceph-mds1 handle_mds_map:
handling map in rankless mode
2018-09-02 00:16:40.939638 7f2dec699700 10 mds.beacon.ceph-mds1
set_want_state: up:boot -> up:standby
2018-09-02 00:16:40.939639 7f2dec699700  1 mds.ceph-mds1 handle_mds_map
standby
2018-09-02 00:16:40.940132 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:boot seq 2 rtt 0.400562
2018-09-02 00:16:40.958557 7f2dec699700  5 mds.ceph-mds1 handle_mds_map
epoch 3605 from mon.2
2018-09-02 00:16:40.958579 7f2dec699700 10 mds.ceph-mds1      my compat
compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds
uses inline data,8=no anchor table,9=file layout v2}
2018-09-02 00:16:40.958582 7f2dec699700 10 mds.ceph-mds1  mdsmap compat
compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate
object,5=mds uses versioned encoding,6=dirfrag is stored in omap,8=no
anchor table,9=file layout v2}
2018-09-02 00:16:40.958584 7f2dec699700 10 mds.ceph-mds1 map says I am
10.14.4.61:6800/938934393 mds.0.0 state up:standby-replay
2018-09-02 00:16:40.958721 7f2dec699700  4 mds.0.purge_queue
operator():  data pool 16 not found in OSDMap
2018-09-02 00:16:40.958772 7f2dec699700 10 mds.ceph-mds1 handle_mds_map:
initializing MDS rank 0
2018-09-02 00:16:40.958985 7f2dec699700 10 mds.0.0 update_log_config
log_to_monitors {default=true}
2018-09-02 00:16:40.958987 7f2dec699700 10 mds.0.0 create_logger
2018-09-02 00:16:40.959109 7f2dec699700  7 mds.0.server operator(): full
= 0 epoch = 0
2018-09-02 00:16:40.959118 7f2dec699700  4 mds.0.purge_queue
operator():  data pool 16 not found in OSDMap
2018-09-02 00:16:40.959121 7f2dec699700  4 mds.0.0 handle_osd_map epoch
0, 0 new blacklist entries
2018-09-02 00:16:40.959128 7f2dec699700 10 mds.0.server apply_blacklist:
killed 0
2018-09-02 00:16:40.959236 7f2dec699700 10 mds.ceph-mds1 handle_mds_map:
handling map as rank 0
2018-09-02 00:16:40.959245 7f2dec699700  1 mds.0.0 handle_mds_map i am
now mds.201185519.0 replaying mds.0.0
2018-09-02 00:16:40.959247 7f2dec699700  1 mds.0.0 handle_mds_map state
change up:boot --> up:standby-replay
2018-09-02 00:16:40.959256 7f2dec699700 10 mds.beacon.ceph-mds1
set_want_state: up:standby -> up:standby-replay
2018-09-02 00:16:40.959258 7f2dec699700  1 mds.0.0 replay_start
2018-09-02 00:16:40.959261 7f2dec699700  7 mds.0.cache set_recovery_set
2018-09-02 00:16:40.959262 7f2dec699700  1 mds.0.0  recovery set is
2018-09-02 00:16:40.959266 7f2dec699700  2 mds.0.0 boot_start 0: opening
inotable
2018-09-02 00:16:40.959269 7f2dec699700 10 mds.0.inotable: load
2018-09-02 00:16:40.959303 7f2dec699700  2 mds.0.0 boot_start 0: opening
sessionmap
2018-09-02 00:16:40.959305 7f2dec699700 10 mds.0.sessionmap load
2018-09-02 00:16:40.959321 7f2dec699700  2 mds.0.0 boot_start 0: opening
mds log
2018-09-02 00:16:40.959322 7f2dec699700  5 mds.0.log open discovering
log bounds
2018-09-02 00:16:40.959377 7f2dec699700  2 mds.0.0 boot_start 0: opening
snap table
2018-09-02 00:16:40.959382 7f2dec699700 10 mds.0.snaptable: load
2018-09-02 00:16:40.959411 7f2de568b700 10 mds.0.log _submit_thread start
2018-09-02 00:16:40.959862 7f2dec699700  7 mds.0.server operator(): full
= 0 epoch = 17318
2018-09-02 00:16:40.959871 7f2dec699700  4 mds.0.0 handle_osd_map epoch
17318, 0 new blacklist entries
2018-09-02 00:16:40.959874 7f2dec699700 10 mds.0.server apply_blacklist:
killed 0
2018-09-02 00:16:44.539645 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:standby-replay seq 3
2018-09-02 00:16:44.541567 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:standby-replay seq 3 rtt 0.001896
2018-09-02 00:16:48.539763 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:standby-replay seq 4
2018-09-02 00:16:52.539853 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:standby-replay seq 5
2018-09-02 00:16:54.256642 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:standby-replay seq 4 rtt 5.716853
2018-09-02 00:16:54.284305 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:standby-replay seq 5 rtt 1.744427
2018-09-02 00:16:56.539954 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:standby-replay seq 6
2018-09-02 00:16:57.258121 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:standby-replay seq 6 rtt 0.718152
2018-09-02 00:17:00.540070 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:standby-replay seq 7
2018-09-02 00:17:00.541845 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:standby-replay seq 7 rtt 0.001748
2018-09-02 00:17:04.540150 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:standby-replay seq 8
2018-09-02 00:17:05.625047 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:standby-replay seq 8 rtt 1.084827
2018-09-02 00:17:08.540241 7f2de9693700 10 mds.beacon.ceph-mds1 _send
up:standby-replay seq 9
2018-09-02 00:17:08.558553 7f2dec699700 10 mds.beacon.ceph-mds1
handle_mds_beacon up:standby-replay seq 9 rtt 0.018286

I can see there is a compat mismatch (my MDS advertises "7=mds uses
inline data", which the mdsmap compat does not include), but I'm not
sure how to resolve it. It could be caused by the MDSes running on
Kubernetes having silently started newer versions (since the Docker
images have no version tags), which introduced a version mismatch.
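
In case it helps, these are the commands I believe should show what is
actually running versus what the monitors hold in the fsmap (treat this
as a sketch; I'm not certain it is the right place to look):

  # versions each running daemon reports to the cluster
  ceph versions

  # per-MDS metadata, including the ceph_version of each daemon
  ceph mds metadata

  # the fsmap as the monitors see it, including the compat flags above
  ceph fs dump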

I tried to follow the instructions at
http://docs.ceph.com/docs/mimic/cephfs/upgrading/, but I can't carry out
the 'stop MDS', 'upgrade' and 'start MDS' steps as written, since these
MDSes run in Kubernetes, and I'm reluctant to upgrade them to 12.2.8
because I can't roll back (there are no Docker image tags for 12.2.7, or
for any specific version for that matter:
https://hub.docker.com/r/ceph/daemon/tags/).
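
In case it helps anyone suggest a way forward: since the 'stop MDS' and
'start MDS' steps would map to scaling the deployments, and rollback is
my main worry, I'm considering pinning the image by digest instead of by
tag. The deployment name (ceph-mds), the container name (mds) and the
digest below are placeholders rather than my real values, so this is
only a sketch of what I have in mind:

  # record the digest of the image that is currently running
  kubectl get pods -o jsonpath='{.items[*].status.containerStatuses[*].imageID}'

  # pin the MDS deployment to a specific digest so it can be rolled back later
  kubectl set image deployment/ceph-mds mds=ceph/daemon@sha256:<digest>

  # the 'stop MDS' / 'start MDS' steps, done by scaling the deployment
  kubectl scale deployment/ceph-mds --replicas=0
  kubectl scale deployment/ceph-mds --replicas=1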

I tried manually installing the 12.2.7 Debian packages on one of the
VMs, but that doesn't seem to help: the VM runs into the same issue as
it did with 12.2.8, even though the 12.2.7 instances in Kubernetes are
working fine.
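
For completeness, the manual install on the VM was done roughly like
this (the '-1xenial' version suffix is a guess and depends on the distro
codename, and <id> stands for whatever the MDS instance is called on
that host):

  apt-get update
  apt-get install --allow-downgrades \
      ceph-mds=12.2.7-1xenial ceph-base=12.2.7-1xenial ceph-common=12.2.7-1xenial
  systemctl restart ceph-mds@<id>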

I hope someone will be able to help with this situation.

Kind regards,

Marlinc

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



