The issue is that I can create a volume, but I can attach it to an instance only while the instance is shut down.
If I attach a volume to an instance that is shut down and then restart the instance, the instance goes into the error state.
The logs are attached.
Jul 23 17:06:10 master 2013-07-23 17:06:10.513 ERROR nova.compute.manager [req-ecff0f93-aa84-4471-aa47-4628c790fa54 admin admin] [instance: e1c8a73a-ff63-4c09-b24a-2ab755aa4836] Cannot reboot instance: [Errno 32] Corrupt image download. Checksum was d2f67e6e12e87ce50a42b7f0c595cde2 expected c352f4e7121c6eae958bc1570324f17e
Jul 23 17:06:10 master 2013-07-23 17:06:10.934 INFO nova.osapi_compute.wsgi.server [-] (3925) accepted ('171.71.119.2', 37555)
Jul 23 17:06:11 master 2013-07-23 17:06:11.401 ERROR nova.openstack.common.rpc.amqp [req-ecff0f93-aa84-4471-aa47-4628c790fa54 admin admin] Exception during message handling
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp   File "/opt/stack/nova/nova/openstack/common/rpc/amqp.py", line 426, in _process_data
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp     **args)
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp   File "/opt/stack/nova/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp     result = getattr(proxyobj, method)(ctxt, **kwargs)
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp   File "/opt/stack/nova/nova/exception.py", line 99, in wrapped
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp     temp_level, payload)
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp     self.gen.next()
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp   File "/opt/stack/nova/nova/exception.py", line 76, in wrapped
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp     return f(self, context, *args, **kw)
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp   File "/opt/stack/nova/nova/compute/manager.py", line 228, in decorated_function
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp     pass
2013-07-23 17:06:11.401 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/contextlib.py",
Logs collected when I rebooted another instance:
15:32.666 ERROR nova.compute.manager [req-464776fd-2832-4f76-91fa-3e4eff173064 None None] [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64] error during stop() in sync_power_state.
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64] Traceback (most recent call last):
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/compute/manager.py", line 4421, in _sync_instance_power_state
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     self.conductor_api.compute_stop(context, db_instance)
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/conductor/api.py", line 333, in compute_stop
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     return self._manager.compute_stop(context, instance, do_cast)
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/conductor/rpcapi.py", line 483, in compute_stop
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     return self.call(context, msg, version='1.43')
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 126, in call
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     result = rpc.call(context, real_topic, msg, timeout)
Jul 23 17:17:18 slave2 2013-07-23 17:17:18.380 ERROR nova.virt.libvirt.driver [req-560b46ed-e96e-4645-a23e-3eba6f51437c admin admin] An error occurred while trying to launch a defined domain with xml: <domain type='qemu'>
  <name>instance-0000000b</name>
  <uuid>4b58dea1-f281-4818-82da-8b9f5f923f64</uuid>
  <memory unit='KiB'>524288</memory>
  <currentMemory unit='KiB'>524288</currentMemory>
  <vcpu placement='static'>1</vcpu>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>OpenStack Foundation</entry>
      <entry name='product'>OpenStack Nova</entry>
      <entry name='version'>2013.2</entry>
      <entry name='serial'>38047832-f758-4e6d-aedf-2d6cf02d6b1e</entry>
      <entry name='uuid'>4b58dea1-f281-4818-82da-8b9f5f923f64</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-1.4'>hvm</type>
    <kernel>/opt/stack/data/nova/instances/4b58dea1-f281-4818-82da-8b9f5f923f64/kernel</kernel>
    <initrd>/opt/stack/data/nova/instances/4b58dea1-f281-4818-82da-8b9f5f923f64/ramdisk</initrd>
    <cmdline>root=/dev/vda console=tty0 console=ttyS0</cmdline>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/opt/stack/data/nova/instances/4b58dea1-f281-4818-82da-8b9f5f923f64/disk'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <auth username='volumes'>
        <secret type='ceph' uuid='62d0b384-5
Jul 23 17:17:18 slave2 2013-07-23 17:17:18.410 ERROR nova.compute.manager [req-560b46ed-e96e-4645-a23e-3eba6f51437c admin admin] [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64] Cannot reboot instance: internal error rbd username 'volumes' specified but secret not found
I had set up the virsh secret as described in the Ceph/OpenStack documentation. How can I verify it?
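Is comparing what libvirt has stored against the Ceph key the right check? I am guessing something like the following (the UUID here is just the rbd_secret_uuid from my cinder.conf, and client.volumes is the user from the docs):

# list the secrets libvirt knows about on this compute node
sudo virsh secret-list
# the value stored under the rbd_secret_uuid from cinder.conf ...
sudo virsh secret-get-value 62d0b384-50ad-2e17-15ed-66bfeda40252
# ... should match the key Ceph reports for the volumes user
ceph auth get-key client.volumes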
On Tue, Jul 23, 2013 at 4:49 PM, johnu <johnugeorge109@xxxxxxxxx> wrote:
There is a hidden bug which I couldn't reproduce. I was using devstack for OpenStack, and I enabled the syslog option to get the nova and cinder logs. After a reboot, everything was fine: I was able to create volumes, and I verified them in rados. Another thing I noticed is that I don't have a cinder user as assumed in the devstack script, so I didn't change the owner of the keyring files and they are still owned by root. It works fine nonetheless.
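By "verified in rados" I just mean listing the pool contents; assuming the pool and the Ceph user are both named volumes, roughly:

# cinder-created volumes show up as volume-<uuid> images
rbd ls -p volumes --id volumes
# the underlying RADOS objects can also be listed directly
rados -p volumes ls --id volumes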
On Tue, Jul 23, 2013 at 6:19 AM, Sebastien Han <sebastien.han@xxxxxxxxxxxx> wrote:

Can you send your ceph.conf too?
Is /etc/ceph/ceph.conf present? Is the key of the volumes user present too?

––––
Sébastien Han
Cloud Engineer
"Always give 100%. Unless you're giving blood."
Email: sebastien.han@xxxxxxxxxxxx – Skype: han.sbastien
Address: 10, rue de la Victoire – 75009 Paris
Web: www.enovance.com – Twitter: @enovance

Hi,
I have a three-node Ceph cluster, and ceph -w reports health OK. I have OpenStack on the same cluster and am trying to map cinder and glance onto RBD.
I have followed the steps given in http://ceph.com/docs/next/rbd/rbd-openstack/
New settings added to cinder.conf on the three nodes:
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
glance_api_version=2
rbd_user=volumes
rbd_secret_uuid=62d0b384-50ad-2e17-15ed-66bfeda40252 (different for each node)
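For reference, the libvirt secret on each compute node was defined roughly along the lines of the Ceph/OpenStack guide; the exact XML below is illustrative, with the UUID being the rbd_secret_uuid from that node's cinder.conf:

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <uuid>62d0b384-50ad-2e17-15ed-66bfeda40252</uuid>
  <usage type='ceph'>
    <name>client.volumes secret</name>
  </usage>
</secret>
EOF
sudo virsh secret-define --file secret.xml
# store the key of the volumes user under that UUID
sudo virsh secret-set-value --secret 62d0b384-50ad-2e17-15ed-66bfeda40252 \
    --base64 $(ceph auth get-key client.volumes)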
Logs seen when I run ./rejoin.sh:
2013-07-22 20:35:01.900 INFO cinder.service [-] Starting 1 workers
2013-07-22 20:35:01.909 INFO cinder.service [-] Started child 2290
2013-07-22 20:35:01.965 AUDIT cinder.service [-] Starting cinder-volume node (version 2013.2)
2013-07-22 20:35:02.129 ERROR cinder.volume.drivers.rbd [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] error connecting to ceph cluster
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd Traceback (most recent call last):
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 243, in check_for_setup_error
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd with RADOSClient(self):
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 215, in __init__
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd self.cluster, self.ioctx = driver._connect_to_rados(pool)
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 263, in _connect_to_rados
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd client.connect()
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd File "/usr/lib/python2.7/dist-packages/rados.py", line 192, in connect
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd raise make_ex(ret, "error calling connect")
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd ObjectNotFound: error calling connect
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd
2013-07-22 20:35:02.149 ERROR cinder.service [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] Unhandled exception
2013-07-22 20:35:02.149 TRACE cinder.service Traceback (most recent call last):
2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 228, in _start_child
2013-07-22 20:35:02.149 TRACE cinder.service self._child_process(wrap.server)
2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 205, in _child_process
2013-07-22 20:35:02.149 TRACE cinder.service launcher.run_server(server)
2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 96, in run_server
2013-07-22 20:35:02.149 TRACE cinder.service server.start()
2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/service.py", line 359, in start
2013-07-22 20:35:02.149 TRACE cinder.service self.manager.init_host()
2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/volume/manager.py", line 139, in init_host
2013-07-22 20:35:02.149 TRACE cinder.service self.driver.check_for_setup_error()
2013-07-22 20:35:02.149 TRACE cinder.service File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 248, in check_for_setup_error
2013-07-22 20:35:02.149 TRACE cinder.service raise exception.VolumeBackendAPIException(data=...)
2013-07-22 20:35:02.149 TRACE cinder.service VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster
2013-07-22 20:35:02.149 TRACE cinder.service
2013-07-22 20:35:02.191 INFO cinder.service [-] Child 2290 exited with status 2
2013-07-22 20:35:02.192 INFO cinder.service [-] _wait_child 1
2013-07-22 20:35:02.193 INFO cinder.service [-] wait wrap.failed True
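For what it's worth, I suppose the connect() that fails above can be reproduced by hand with something like this (the keyring path is a guess based on the default naming; please correct me if this is not a valid check):

# does the volumes user have a usable key and reachable monitors?
ceph -s --id volumes --keyring /etc/ceph/ceph.client.volumes.keyring
# roughly the same librados connect the cinder driver performs
rados lspools --id volumes --keyring /etc/ceph/ceph.client.volumes.keyring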
Can someone help me with some debug points to solve it?
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com