ceph hammer : rbd info/Status : operation not supported (95) (EC+RBD tier pools)

Hi,

 

I just started testing VMs inside Ceph this week, running Hammer 0.94.5 here.

 

I built several pools, using cache tiering (roughly set up as shown below):

- A small replicated SSD cache pool (only 5 SSDs, but I thought it would be better for IOPS; I intend to compare against a disks-only setup)

- Overlaying a larger erasure-coded (EC) pool
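
For reference, the tiering was created more or less like this; the pool names are the ones from my cluster, but the EC profile name, PG counts and values below are from memory and may not match exactly what I ran:

    # EC base pool plus replicated SSD cache pool
    ceph osd erasure-code-profile set ec-profile k=4 m=2
    ceph osd pool create irfu-virt 512 512 erasure ec-profile
    ceph osd pool create ssd-hot-irfu-virt 128 128 replicated
    # attach the SSD pool as a writeback cache tier in front of the EC pool
    ceph osd tier add irfu-virt ssd-hot-irfu-virt
    ceph osd tier cache-mode ssd-hot-irfu-virt writeback
    ceph osd tier set-overlay irfu-virt ssd-hot-irfu-virt
    ceph osd pool set ssd-hot-irfu-virt hit_set_type bloom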

 

I just have 2 VMs in Ceph… and one of them is breaking something.

The VM that is not breaking was migrated using qemu-img to create the Ceph volume and then migrate the data (roughly as shown after the info dump below). Its RBD format is 1:

rbd image 'xxx-disk1':

        size 20480 MB in 5120 objects

        order 22 (4096 kB objects)

        block_name_prefix: rb.0.83a49.3d1b58ba

        format: 1
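
For reference, that migration was done more or less like this (the source path and format are placeholders, from memory):

    # create the RBD image and copy the data in one go
    qemu-img convert -f qcow2 -O raw /var/lib/libvirt/images/xxx-disk1.qcow2 rbd:irfu-virt/xxx-disk1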

 

The VM that's failing has RBD format 2.

This is what I had before things started breaking:

rbd image 'yyy-disk1':

        size 10240 MB in 2560 objects

        order 22 (4096 kB objects)

        block_name_prefix: rbd_data.8ae1f47398c89

        format: 2

        features: layering, striping

        flags:

        stripe unit: 4096 kB

        stripe count: 1

 

 

The VM started behaving weirdly during its install, with a huge iowait percentage (that is to say, it did not take long to go wrong ;)).

Now, this is the only thing I can get:

 

[root@ceph0 ~]# rbd -p irfu-virt info yyy-disk1

2016-02-24 18:30:33.213590 7f00e6f6d7c0 -1 librbd::ImageCtx: error reading image id: (95) Operation not supported

rbd: error opening image yyy-disk1: (95) Operation not supported
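
Would it make sense to poke at the image's id object directly with rados? I was thinking of something like the following (assuming I am reading the error right and the id of a format 2 image lives in an rbd_id.<image-name> object):

    # check whether the id object is reachable in the base pool and in the cache pool
    rados -p irfu-virt stat rbd_id.yyy-disk1
    rados -p ssd-hot-irfu-virt stat rbd_id.yyy-disk1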

 

One thing to note: the VM *IS STILL* working: I can still do disk operations, apparently.

During the VM installation, I realized I had wrongly set the cache tier's target maximum size to 100 MB instead of 100 GB, and Ceph complained it was almost full:

     health HEALTH_WARN

            'ssd-hot-irfu-virt' at/near target max
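
(If it matters, I assume the fix for the sizing mistake itself is just to raise target_max_bytes on the cache pool, i.e. 100 GB expressed in bytes:)

    ceph osd pool set ssd-hot-irfu-virt target_max_bytes 107374182400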

 

My question is: am I facing the bug reported in the list thread titled “Possible Cache Tier Bug - Can someone confirm”?

Or did I do something wrong?

 

The libvirt and qemu-kvm versions writing into Ceph are the following:

libvirt-1.2.17-13.el7_2.3.x86_64

qemu-kvm-1.5.3-105.el7_2.3.x86_64

 

Any idea how I could recover the VM's disk image, if possible?

Please note I have no problem with deleting the VM and rebuilding it; I only spawned it for testing.

As a matter of fact, I just ran “virsh destroy” on the VM to see if I could start it again… and I can't:

 

# virsh start yyy

error: Failed to start domain yyy

error: internal error: process exited while connecting to monitor: 2016-02-24T17:49:59.262170Z qemu-kvm: -drive file=rbd:irfu-virt/yyy-disk1:id=irfu-virt:key=***==:auth_supported=cephx\;none:mon_host=_____\:6789,if=none,id=drive-virtio-disk0,format=raw: error reading header from yyy-disk1

2016-02-24T17:49:59.263743Z qemu-kvm: -drive file=rbd:irfu-virt/yyy-disk1:id=irfu-virt:key=A***==:auth_supported=cephx\;none:mon_host=___\:6789,if=none,id=drive-virtio-disk0,format=raw: could not open disk image rbd:irfu-virt/___-disk1:id=irfu-***==:auth_supported=cephx\;none:mon_host=___\:6789: Could not open 'rbd:irfu-virt/yyy-disk1:id=irfu-virt:key=***

 

Ideas?

Thanks

Frederic

