Hi,

It would seem that the order in which the mon addresses are declared
(v2 first, then v1, and not the other way around) is important.

Albert restarted all services after this modification and everything is
back to normal.

On Wed, 17 Jul 2024 at 09:40, David C. <david.casier@xxxxxxxx> wrote:

> Hi Frédéric,
>
> The curious thing about Albert's cluster is that (msgr) v1 and v2 are both
> present on the mons, as well as on the OSD backend.
>
> But v2 is absent on the public OSD and MDS network.
>
> The specific point is that the public network has been changed.
>
> At first, I thought it was the order of declaration of my_host (v1 before
> v2) but apparently, that's not it.
>
>
> On Wed, 17 Jul 2024 at 09:21, Frédéric Nass <
> frederic.nass@xxxxxxxxxxxxxxxx> wrote:
>
>> Hi David,
>>
>> Redeploying 2 out of 3 MONs a few weeks back (to have them using RocksDB
>> to be ready for Quincy) prevented some clients from connecting to the
>> cluster and mounting cephfs volumes.
>>
>> Before the redeploy, these clients were using port 6789 (v1) explicitly
>> as connections wouldn't work with port 3300 (v2).
>> After the redeploy, removing port 6789 from mon_ips fixed the situation.
>>
>> It seems that msgr v2 activation only occurred after all 3 MONs were
>> redeployed and used RocksDB. Not sure why this happened though.
>>
>> @Albert, if this cluster has been upgraded several times, you might want
>> to check /var/lib/ceph/$(ceph fsid)/kv_backend, redeploy the MONs if
>> leveldb, make sure all clients use the new mon_host syntax in ceph.conf
>> ([v2:<cthulhu1_ip>:3300,v1:<cthulhu1_ip>:6789], etc.) and check their
>> ability to connect to port 3300.
>>
>> Cheers,
>> Frédéric.
>>
>> ----- On 16 Jul 24, at 17:53, David david.casier@xxxxxxxx wrote:
>>
>> > Albert,
>> >
>> > The network is ok.
>> >
>> > However, strangely, the osd and mds did not activate msgr v2 (msgr v2 was
>> > activated on mon).
>> >
>> > It is possible to work around this by adding the "ms_mode=legacy" option,
>> > but you need to find out why msgr v2 is not activated.
>> >
>> >
>> > On Tue, 16 Jul 2024 at 15:18, Albert Shih <Albert.Shih@xxxxxxxx> wrote:
>> >
>> >> On 16/07/2024 at 15:04:05+0200, David C. wrote:
>> >> Hi,
>> >>
>> >> >
>> >> > I think it's related to your network change.
>> >>
>> >> I thought about it, but in that case why does the old (pre-upgrade)
>> >> server work?
>> >>
>> >> > Can you send me the output of "ceph report"?
>> >>
>> >> Nothing related to the old subnet (see attached file).
>> >>
>> >> Regards.
>> >>
>> >> JAS
>> >> --
>> >> Albert SHIH 🦫 🐸
>> >> Observatoire de Paris
>> >> France
>> >> Heure locale/Local time:
>> >> Tue 16 Jul 2024 15:14:21 CEST
>> >>
>> > _______________________________________________
>> > ceph-users mailing list -- ceph-users@xxxxxxx
>> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
>> >
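
Since the fix in this thread came down to listing v2 before v1 inside each
mon_host group, here is a minimal sketch of how that ordering could be
checked on a mon_host string before restarting services. The function name
and the addresses are hypothetical, it only handles the bracketed IPv4 form
quoted above, and it says nothing about whether msgr v2 is actually enabled
on the daemons.

```python
import re


def misordered_mon_groups(mon_host):
    """Return the bracketed groups in which a v1 address precedes a v2 address."""
    bad = []
    # Each monitor is assumed to be written as one bracketed group,
    # e.g. [v2:<ip>:3300,v1:<ip>:6789]. IPv6 literals are not handled here.
    for group in re.findall(r"\[([^\]]*)\]", mon_host):
        # Keep only the protocol prefix ("v1" or "v2") of each address.
        protocols = [
            addr.split(":", 1)[0].strip()
            for addr in group.split(",")
            if addr.strip()
        ]
        if (
            "v1" in protocols
            and "v2" in protocols
            and protocols.index("v1") < protocols.index("v2")
        ):
            bad.append("[" + group + "]")
    return bad


# Hypothetical addresses from a documentation range, for illustration only.
good = "[v2:192.0.2.11:3300,v1:192.0.2.11:6789],[v2:192.0.2.12:3300,v1:192.0.2.12:6789]"
swapped = "[v1:192.0.2.11:6789,v2:192.0.2.11:3300],[v2:192.0.2.12:3300,v1:192.0.2.12:6789]"

print(misordered_mon_groups(good))
print(misordered_mon_groups(swapped))
```

An empty result means every group that declares both protocols lists v2
first; any group returned is a candidate for reordering in ceph.conf.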