Hi all,

I'm facing a very strange issue after migrating my Luminous cluster to Nautilus. I have 2 pools configured for OpenStack Cinder volumes in a multiple-backend setup: one "service" Ceph pool with cache tiering and one "R&D" Ceph pool. After the upgrade, the R&D pool became inaccessible to Cinder, and the cinder-volume service using this pool can't start anymore.

What is strange is that neither OpenStack nor Ceph reports any error: the Ceph cluster is healthy, all OSDs are up and running, and the "service" pool still works well with the other Cinder service on the same OpenStack host.

I followed the upgrade procedure exactly (https://ceph.com/releases/v14-2-0-nautilus-released/#upgrading-from-mimic-or-luminous). There was no problem during the upgrade, but I can't understand why Cinder still fails with this pool. I can access the pool, list it, and create volumes on it with the rbd or rados commands from the monitors, but on the OpenStack hypervisor the rbd and rados ls commands hang, and rados ls gives this message (134.158.208.37 is an OSD node, 10.158.246.214 is an OpenStack hypervisor):

2019-07-02 11:26:15.999869 7f63484b4700 0 -- 10.158.246.214:0/1404677569 >> 134.158.208.37:6884/2457222 pipe(0x555c2bf96240 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x555c2bf97500).fault

ceph version 14.2.1
OpenStack Newton

I spent 2 days checking everything on the Ceph side but couldn't find anything problematic. If you have any hints that could help me, I would appreciate it :)

Adrien
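
One check that may help narrow this down is a plain TCP probe from the hypervisor to the OSD address and port shown in the message (134.158.208.37, port 6884). The following is only a minimal sketch; the function name can_connect and the 3-second timeout are assumptions for illustration, and a successful connect only shows that the port is reachable, not that the Ceph client can talk to the OSD.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper: it only tests network reachability of the port,
    it does not speak the Ceph messenger protocol.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks
        return False
```

Run on the hypervisor, can_connect("134.158.208.37", 6884) would return False if something between the two hosts blocks or drops the connection, and True if the TCP handshake completes.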