Re: Nova fails to download image from Glance backed with Ceph

Did you run out of space? That happened to me when a customer tried to create a 1 TB image.
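
A minimal sketch of that check, assuming Python 3 on the compute node and that the image cache lives under /var/lib/nova/instances/_base (the path mentioned further down the thread). The function name and the 10% headroom are illustrative assumptions, not anything Nova does itself:

```python
import shutil


def has_room_for_image(cache_dir, image_size_bytes, headroom=0.10):
    """Return True if the filesystem holding cache_dir has enough free
    space for an image of image_size_bytes plus a safety margin.

    headroom is a hypothetical margin (10% by default), not a Nova setting.
    """
    free = shutil.disk_usage(cache_dir).free
    needed = int(image_size_bytes * (1 + headroom))
    return free >= needed
```

For a 10 GB image this could be called as has_room_for_image("/var/lib/nova/instances/_base", 10 * 1024**3) on each compute node that the scheduler tried.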

Z.

> On 04 Sep 2015, at 15:15, Sebastien Han <seb@xxxxxxxxxx> wrote:
> 
> Just to rule out a possible infrastructure issue (load balancers, etc.):
> did you try downloading the image directly on the compute node, with something like rbd export?
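
Once the image has been exported to a local file, its checksum can be compared with the "expected" value from the error message. The sketch below assumes the checksum Glance recorded is an MD5 digest of the image data (the 32-hex-digit values in the error are consistent with that) and that Python is available on the node. The function names are illustrative:

```python
import hashlib


def md5_of_file(path, chunk_size=4 * 1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks so a
    multi-gigabyte image does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare a file's MD5 with the checksum reported as 'expected'."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

If the exported file matches the expected checksum, the data in Ceph is probably intact and the problem is more likely in the download path. A mismatch, or an export that stops early, would point at the cluster itself.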
> 
>> On 04 Sep 2015, at 11:56, Vasiliy Angapov <angapov@xxxxxxxxx> wrote:
>> 
>> Hi all,
>> 
>> I'm not sure whether this bug belongs to OpenStack or Ceph, but I'm
>> writing here in the hope that someone else has run into it.
>> 
>> I configured a test OpenStack installation with Glance images stored in Ceph
>> 0.94.3. Nova uses local storage.
>> When I try to launch an instance from a large image stored in Ceph,
>> it fails to spawn, with this error in nova-conductor.log:
>> 
>> 2015-09-04 11:52:35.076 3605449 ERROR nova.scheduler.utils
>> [req-c6af3eca-f166-45bd-8edc-b8cfadeb0d0b
>> 82c1f134605e4ee49f65015dda96c79a 448cc6119e514398ac2793d043d4fa02 - -
>> -] [instance: 18c9f1d5-50e8-426f-94d5-167f43129ea6] Error from last
>> host: slpeah005 (node slpeah005.cloud): [u'Traceback (most recent call
>> last):\n', u'  File
>> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2220,
>> in _do_build_and_run_instance\n    filter_properties)\n', u'  File
>> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2363,
>> in _build_and_run_instance\n    instance_uuid=instance.uuid,
>> reason=six.text_type(e))\n', u'RescheduledException: Build of instance
>> 18c9f1d5-50e8-426f-94d5-167f43129ea6 was re-scheduled: [Errno 32]
>> Corrupt image download. Checksum was 625d0686a50f6b64e57b1facbc042248
>> expected 4a7de2fbbd01be5c6a9e114df145b027\n']
>> 
>> Nova tries 3 different hosts, gets the same error message on each
>> one, and then fails to spawn the instance.
>> The small Cirros image works fine. The issue
>> happens with large images, around 10 GB in size.
>> I also looked into the /var/lib/nova/instances/_base folder and
>> found that the image does start downloading, but at some point
>> the download is interrupted for an unknown reason and the instance
>> gets deleted.
>> 
>> I looked at the syslog and found many messages like this:
>> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735094
>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.22 since
>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>> (cutoff 2015-09-04 12:51:32.735011)
>> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735099
>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.23 since
>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>> (cutoff 2015-09-04 12:51:32.735011)
>> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735104
>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.24 since
>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>> (cutoff 2015-09-04 12:51:32.735011)
>> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735108
>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.26 since
>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>> (cutoff 2015-09-04 12:51:32.735011)
>> Sep  4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735118
>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.27 since
>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>> (cutoff 2015-09-04 12:51:32.735011)
>> 
>> I've also monitored the number of file descriptors held by the
>> nova-compute process, but it never exceeds 102 ("echo
>> /proc/NOVA_COMPUTE_PID/fd/* | wc -w", as Jan advised on this ML).
>> The problem also seems to have appeared only in 0.94.3; in 0.94.2
>> everything worked just fine!
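
The file descriptor check quoted above can also be sketched in Python and compared against the process's soft limit. This assumes a Linux /proc filesystem and enough privileges to read the target process's entries. The function names are illustrative:

```python
import os


def count_open_fds(pid):
    """Count entries in /proc/<pid>/fd, the equivalent of
    'echo /proc/<pid>/fd/* | wc -w'."""
    return len(os.listdir("/proc/{}/fd".format(pid)))


def soft_fd_limit(pid):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits,
    or None if it is unlimited or the line is not found."""
    with open("/proc/{}/limits".format(pid)) as f:
        for line in f:
            if line.startswith("Max open files"):
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None
```

A count that stays well below the soft limit for the whole download, as with the 102 reported here, makes file descriptor exhaustion an unlikely cause.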
>> 
>> Would be very grateful for any help!
>> 
>> Vasily.
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@xxxxxxxxxxxxxx
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> Cheers.
> ––––
> Sébastien Han
> Senior Cloud Architect
> 
> "Always give 100%. Unless you're giving blood."
> 
> Mail: seb@xxxxxxxxxx
> Address: 11 bis, rue Roquépine - 75008 Paris
> 
