Cinder pool inaccessible after Nautilus upgrade

Adrien Georget <adrien.georget@xxxxxxxxxxx> · Tue, 2 Jul 2019 13:35:23 +0200



    Hi all,

    
    I'm facing a very strange issue after migrating my Luminous cluster
    to Nautilus.

    I have 2 pools configured for OpenStack Cinder volumes in a
    multi-backend setup: one "service" Ceph pool with cache tiering and
    one "R&D" Ceph pool.

    After the upgrade, the R&D pool became inaccessible for Cinder
    and the cinder-volume service using this pool can't start anymore.

    What is strange is that OpenStack and Ceph report no errors: the Ceph
    cluster is healthy, all OSDs are up and running, and the "service"
    pool still works fine with the other cinder-volume service on the
    same OpenStack host.

    I followed the upgrade procedure exactly
(https://ceph.com/releases/v14-2-0-nautilus-released/#upgrading-from-mimic-or-luminous)
    and had no problems during the upgrade, but I can't understand why
    Cinder still fails with this pool.

    I can access, list, and create volumes on this pool with the rbd or
    rados commands from the monitors, but on the OpenStack hypervisor the
    rbd and rados ls commands hang, and rados ls gives this message
    (134.158.208.37 is an OSD node, 10.158.246.214 is an OpenStack
    hypervisor):

    
    2019-07-02 11:26:15.999869 7f63484b4700  0 --
      10.158.246.214:0/1404677569 >> 134.158.208.37:6884/2457222
      pipe(0x555c2bf96240 sd=7 :0 s=1 pgs=0 cs=0 l=1
      c=0x555c2bf97500).fault
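
For reference, a minimal sketch of how the peer address can be pulled out of a fault line like the one above and checked with a plain TCP connect from the hypervisor. The helper names (parse_fault_line, can_connect) are hypothetical, the regular expression is an assumption based only on the single line quoted here, and a successful connect would only show that the port is reachable, not that a RADOS session to the OSD actually works.

```python
import re
import socket

# Matches the "local >> peer" address pair in a messenger fault line.
# Assumption: the format is inferred from the one log line quoted above.
ADDR_PAIR = re.compile(
    r"(?P<local>\d+\.\d+\.\d+\.\d+):\d+/\d+\s+>>\s+"
    r"(?P<peer>\d+\.\d+\.\d+\.\d+):(?P<port>\d+)/\d+"
)


def parse_fault_line(line):
    """Return (local_ip, peer_ip, peer_port), or None if no pair is found."""
    m = ADDR_PAIR.search(line)
    if m is None:
        return None
    return m.group("local"), m.group("peer"), int(m.group("port"))


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

Run on the hypervisor, parse_fault_line on the message above would yield the OSD address and port (134.158.208.37, 6884), which can then be passed to can_connect. Comparing the result with the same check from a monitor host would show whether the problem is at the network level.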

    
    Ceph version 14.2.1

    OpenStack Newton

    
    I spent 2 days checking everything on the Ceph side but couldn't find
    anything problematic...

    If you have any hints that could help me, I would appreciate it :)

    
    Adrien

  