Thanks for the response! The free space on /var/lib/nova/instances is
very large on every compute host, and glance image-download works as
expected.

2015-09-04 21:27 GMT+08:00 Jan Schermer <jan@xxxxxxxxxxx>:
> Didn't you run out of space? That happened to me when a customer tried
> to create a 1TB image...
>
> Z.
>
>> On 04 Sep 2015, at 15:15, Sebastien Han <seb@xxxxxxxxxx> wrote:
>>
>> Just to take a possible infrastructure issue (LBs etc.) out of the
>> picture: did you try to download the image directly on the compute
>> node? Something like rbd export?
>>
>>> On 04 Sep 2015, at 11:56, Vasiliy Angapov <angapov@xxxxxxxxx> wrote:
>>>
>>> Hi all,
>>>
>>> I'm not sure whether this bug belongs to OpenStack or Ceph, but I'm
>>> writing here in the humble hope that someone else has run into this
>>> issue as well.
>>>
>>> I configured a test OpenStack installation with Glance images stored
>>> in Ceph 0.94.3; Nova uses local storage.
>>> When I try to launch an instance from a large image stored in Ceph,
>>> it fails to spawn with the following error in nova-conductor.log:
>>>
>>> 2015-09-04 11:52:35.076 3605449 ERROR nova.scheduler.utils
>>> [req-c6af3eca-f166-45bd-8edc-b8cfadeb0d0b
>>> 82c1f134605e4ee49f65015dda96c79a 448cc6119e514398ac2793d043d4fa02 - - -]
>>> [instance: 18c9f1d5-50e8-426f-94d5-167f43129ea6] Error from last
>>> host: slpeah005 (node slpeah005.cloud): [u'Traceback (most recent call
>>> last):\n', u' File
>>> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2220,
>>> in _do_build_and_run_instance\n filter_properties)\n', u' File
>>> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2363,
>>> in _build_and_run_instance\n instance_uuid=instance.uuid,
>>> reason=six.text_type(e))\n', u'RescheduledException: Build of instance
>>> 18c9f1d5-50e8-426f-94d5-167f43129ea6 was re-scheduled: [Errno 32]
>>> Corrupt image download. Checksum was 625d0686a50f6b64e57b1facbc042248
>>> expected 4a7de2fbbd01be5c6a9e114df145b027\n']
>>>
>>> Nova then tries three different hosts, hits the same error on every
>>> one of them, and finally fails to spawn the instance.
>>> A small Cirros image works fine; the issue only appears with large
>>> images, around 10 GB in size.
>>> I also looked into the /var/lib/nova/instances/_base folder and found
>>> that the image does start downloading, but at some point the download
>>> is interrupted for an unknown reason and the instance gets deleted.
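(For completeness, the check Sebastien suggested can also be done directly
against RBD on the compute node, bypassing the Glance API entirely. A rough
sketch, where the "images" pool name and IMAGE_ID are only placeholders for
the actual Glance pool and image UUID, and the appropriate --id/--keyring
options may be needed if cephx is enabled:

    rbd -p images export IMAGE_ID /tmp/IMAGE_ID.raw
    md5sum /tmp/IMAGE_ID.raw   # compare with the checksum from "glance image-show IMAGE_ID"

If the exported file's md5sum matches the checksum Glance reports, the RBD
read path on that node is most likely fine and the corruption is happening
further up in Nova's download path.)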
>>> I looked at the syslog and found many messages like these:
>>>
>>> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735094
>>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.22 since
>>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>>> (cutoff 2015-09-04 12:51:32.735011)
>>> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735099
>>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.23 since
>>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>>> (cutoff 2015-09-04 12:51:32.735011)
>>> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735104
>>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.24 since
>>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>>> (cutoff 2015-09-04 12:51:32.735011)
>>> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735108
>>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.26 since
>>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>>> (cutoff 2015-09-04 12:51:32.735011)
>>> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735118
>>> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.27 since
>>> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
>>> (cutoff 2015-09-04 12:51:32.735011)
>>>
>>> I've also tried to monitor the number of open file descriptors of the
>>> nova-compute process, but it never goes above 102 ("echo
>>> /proc/NOVA_COMPUTE_PID/fd/* | wc -w", as Jan advised on this list).
>>> It also seems the problem appeared only in 0.94.3; in 0.94.2
>>> everything worked just fine!
>>>
>>> I would be very grateful for any help!
>>>
>>> Vasily.
>>
>> Cheers.
>> ––––
>> Sébastien Han
>> Senior Cloud Architect
>>
>> "Always give 100%. Unless you're giving blood."
>>
>> Mail: seb@xxxxxxxxxx
>> Address: 11 bis, rue Roquépine - 75008 Paris
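One more data point that might be worth collecting while a large instance is
spawning: the open file descriptor count of nova-compute together with the
cluster health at the moment the download aborts. A rough one-liner for that,
assuming nova-compute can be found via pgrep and a client keyring is present
on the node:

    watch -n 5 'ls /proc/$(pgrep -f nova-compute | head -1)/fd | wc -l; ceph -s | grep -E "health|osds"'

If the heartbeat_check messages coincide with OSDs being reported down at the
same moment the download dies, that would point at the cluster side rather
than at Nova.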