Hi Anthony,

Regarding the need to upgrade Ceph: we are upgrading our current OpenStack from Queens (yes, very old) to Antelope, and the OpenStack vendor required us to upgrade Ceph from Luminous to Nautilus for their migration code to work, as the framework they use to migrate/upgrade only works with Nautilus and above.

--Pardhiv

On Tue, Feb 18, 2025 at 11:01 AM Pardhiv Karri <meher4india@xxxxxxxxx> wrote:

> Hi Anthony,
>
> Thank you for the reply. Here is the output from the monitor node. The
> monitor (which also runs the manager) and the OSD nodes were rebooted
> sequentially after the upgrade to Nautilus, so I wonder why they are
> still showing luminous. Is there any way I can fix this?
>
> or1sz2 [root@mon1 ~]# ceph features
> {
>     "mon": [
>         {
>             "features": "0x3ffddff8ffecffff",
>             "release": "luminous",
>             "num": 3
>         }
>     ],
>     "osd": [
>         {
>             "features": "0x3ffddff8ffecffff",
>             "release": "luminous",
>             "num": 111
>         }
>     ],
>     "client": [
>         {
>             "features": "0x3ffddff8ffecffff",
>             "release": "luminous",
>             "num": 322
>         }
>     ],
>     "mgr": [
>         {
>             "features": "0x3ffddff8ffecffff",
>             "release": "luminous",
>             "num": 3
>         }
>     ]
> }
> or1sz2 [root@mon1 ~]# dpkg -l | grep -i ceph
> ii  ceph                  14.2.22-1xenial  amd64  distributed storage and file system
> ii  ceph-base             14.2.22-1xenial  amd64  common ceph daemon libraries and management tools
> ii  ceph-common           14.2.22-1xenial  amd64  common utilities to mount and interact with a ceph storage cluster
> ii  ceph-deploy           2.0.1            all    Ceph-deploy is an easy to use configuration tool
> ii  ceph-mgr              14.2.22-1xenial  amd64  manager for the ceph distributed storage system
> ii  ceph-mon              14.2.22-1xenial  amd64  monitor server for the ceph storage system
> ii  ceph-osd              14.2.22-1xenial  amd64  OSD server for the ceph storage system
> rc  libcephfs1            10.2.11-1trusty  amd64  Ceph distributed file system client library
> ii  libcephfs2            14.2.22-1xenial  amd64  Ceph distributed file system client library
> ii  python-ceph-argparse  14.2.22-1xenial  all    Python 2 utility libraries for Ceph CLI
> ii  python-cephfs         14.2.22-1xenial  amd64  Python 2 libraries for the Ceph libcephfs library
> ii  python-rados          14.2.22-1xenial  amd64  Python 2 libraries for the Ceph librados library
> ii  python-rbd            14.2.22-1xenial  amd64  Python 2 libraries for the Ceph librbd library
> ii  python-rgw            14.2.22-1xenial  amd64  Python 2 libraries for the Ceph librgw library
> or1sz2 [root@or1dra1300 ~]#
>
> Thanks,
> Pardhiv
>
> On Tue, Feb 18, 2025 at 10:55 AM Anthony D'Atri <anthony.datri@xxxxxxxxx> wrote:
>
>> This is one of the pitfalls of package-based installs. This dynamic with
>> Nova and other virtualization systems has been well known for at least a
>> dozen years.
>>
>> I would not expect a Luminous client (i.e. librbd / librados) to have an
>> issue, though; it should be able to handle pg-upmap. If you have a
>> reference indicating the need to update to the Nautilus client, please
>> send it along.
>>
>> I wonder if you have clients that are actually older than Luminous;
>> those could cause problems.
>>
>> Cf. https://tracker.ceph.com/issues/13301
>>
>> Run `ceph features`, which should give you client info. An unfortunate
>> wrinkle is that in the case of pg-upmap, some clients may report "jewel"
>> but their feature bitmaps actually indicate compatibility with pg-upmap.
>> If you see clients that are pre-Luminous, focus restarts and migrations
>> on those.
>>
>> OpenStack components themselves sometimes have dependencies on Ceph
>> versions, so I would look at those and at libvirt itself as well.
>>
>> On Feb 18, 2025, at 1:48 PM, Pardhiv Karri <meher4india@xxxxxxxxx> wrote:
>>
>> Hi,
>>
>> We recently upgraded our Ceph from Luminous to Nautilus and upgraded the
>> Ceph clients on OpenStack (using rbd). All went well, but after a few
>> days we randomly saw instances getting stuck with libvirt_qemu_exporter,
>> which hangs libvirt on the OpenStack compute nodes.
>> We had to kill those instance processes, after which libvirt recovers,
>> but the issue keeps recurring on the compute nodes with other instances.
>> Upon doing some research, I found that we need to migrate the instances
>> to the latest (Nautilus) Ceph client, as they still use the old
>> (Luminous) client they were spun up with. The only way to get them onto
>> the Nautilus client is to live-migrate or reboot them. We have thousands
>> of instances, and doing either takes a long time without impacting
>> customers. Is there any other fix for this issue that does not require
>> migrating or rebooting the instances?
>>
>> Error on compute hosts (host and instance id renamed):
>>
>> Feb 18 00:08:00 cmp03 libvirtd[5362]: 2025-02-18 00:08:00.510+0000: 5627:
>> warning : qemuDomainObjBeginJobInternal:4933 : Cannot start job (query,
>> none) for domain instance-009141b8; current job is (query, none) owned by
>> (5628 remoteDispatchDomainBlockStats, 0 <null>) for (322330s, 0s)
>>
>> Thanks,
>> Pardhiv
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@xxxxxxx
>> To unsubscribe send an email to ceph-users-leave@xxxxxxx
>
> --
> *Pardhiv Karri*
> "Rise and Rise again until LAMBS become LIONS"
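[Editor's note] Anthony's point about "jewel" clients whose bitmaps nonetheless support pg-upmap can be checked directly from the `features` hex value that `ceph features` prints. A minimal sketch, assuming bit 21 is CEPH_FEATURE_SERVER_LUMINOUS, which Ceph's ceph_features.h defines as overlapping with OSDMAP_PG_UPMAP; the helper name is ours, not a Ceph API:

```python
# Decode a "features" bitmask from `ceph features` output and check
# whether it includes the Luminous/pg-upmap capability.
# Assumption: feature bit 21 is CEPH_FEATURE_SERVER_LUMINOUS, which
# shares its bit with OSDMAP_PG_UPMAP in Ceph's ceph_features.h.
SERVER_LUMINOUS_BIT = 21

def supports_pg_upmap(features_hex: str) -> bool:
    """Return True if the feature bitmap has the Luminous/pg-upmap bit set."""
    return bool((int(features_hex, 16) >> SERVER_LUMINOUS_BIT) & 1)

# The bitmap reported by every daemon and client group in the thread above:
print(supports_pg_upmap("0x3ffddff8ffecffff"))  # → True
```

Note that the bitmap in the thread has this bit set for all groups, including the 322 clients, which is consistent with Anthony's expectation that the Luminous-era clients themselves should handle pg-upmap.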