Just to rule out a possible infra issue (LBs etc.): did you try to download the image directly on the compute node, with something like "rbd export", and compare checksums? (A rough sketch of that check is below, after your quoted message.)

> On 04 Sep 2015, at 11:56, Vasiliy Angapov <angapov@xxxxxxxxx> wrote:
>
> Hi all,
>
> I'm not sure whether this bug belongs to OpenStack or Ceph, but I'm
> writing here in the humble hope that someone else has run into it too.
>
> I configured a test OpenStack installation with Glance images stored
> in Ceph 0.94.3. Nova uses local storage.
> But when I try to launch an instance from a large image stored in
> Ceph, it fails to spawn with the following error in nova-conductor.log:
>
> 2015-09-04 11:52:35.076 3605449 ERROR nova.scheduler.utils
> [req-c6af3eca-f166-45bd-8edc-b8cfadeb0d0b
> 82c1f134605e4ee49f65015dda96c79a 448cc6119e514398ac2793d043d4fa02 - - -]
> [instance: 18c9f1d5-50e8-426f-94d5-167f43129ea6] Error from last
> host: slpeah005 (node slpeah005.cloud): [u'Traceback (most recent call
> last):\n', u' File
> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2220,
> in _do_build_and_run_instance\n filter_properties)\n', u' File
> "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2363,
> in _build_and_run_instance\n instance_uuid=instance.uuid,
> reason=six.text_type(e))\n', u'RescheduledException: Build of instance
> 18c9f1d5-50e8-426f-94d5-167f43129ea6 was re-scheduled: [Errno 32]
> Corrupt image download. Checksum was 625d0686a50f6b64e57b1facbc042248
> expected 4a7de2fbbd01be5c6a9e114df145b027\n']
>
> So Nova tries three different hosts, gets the same error message on
> every single one, and then fails to spawn the instance.
> I've tried the small Cirros image and it works fine. The issue only
> happens with large images, around 10 GB in size.
> I also looked into the /var/lib/nova/instances/_base folder and found
> that the image is actually being downloaded, but at some point the
> download is interrupted for an unknown reason and the instance gets
> deleted.
>
> I looked at the syslog and found many messages like these:
>
> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735094
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.22 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735099
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.23 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735104
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.24 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735108
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.26 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
> Sep 4 12:51:37 slpeah003 ceph-osd: 2015-09-04 12:51:37.735118
> 7f092dfd1700 -1 osd.3 3025 heartbeat_check: no reply from osd.27 since
> back 2015-09-04 12:51:31.834203 front 2015-09-04 12:51:31.834203
> (cutoff 2015-09-04 12:51:32.735011)
>
> I've also monitored the number of file descriptors held by the
> nova-compute process, but it never exceeds 102 ("echo
> /proc/NOVA_COMPUTE_PID/fd/* | wc -w", as Jan advised on this list).
> It also seems the problem appeared only in 0.94.3; with 0.94.2
> everything worked just fine!
>
> I would be very grateful for any help!
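Something along these lines should do it (a rough sketch only, assuming Glance uses the default "images" pool and the image was uploaded raw; substitute your own image ID, pool name and cephx user):

    # On the compute node: export the Glance image straight from RADOS.
    # The RBD image name is simply the Glance image ID.
    rbd export images/<GLANCE_IMAGE_ID> /tmp/image-check.raw

    # Compare against the checksum Glance has on record.
    md5sum /tmp/image-check.raw
    glance image-show <GLANCE_IMAGE_ID> | grep checksum

If the md5 of the exported file matches what Glance reports, but Nova still computes a different one, the corruption is more likely happening on the Nova download path than inside the cluster.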
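Those heartbeat_check lines also make me wonder whether the OSDs stay healthy while the download is running. Something simple like the following (plain ceph CLI from any admin node, nothing exotic) should show whether osd.22-27 are actually flapping during the build:

    ceph -s          # overall health while the instance is building
    ceph osd tree    # are osd.22 ... osd.27 marked down?
    ceph -w          # follow cluster events live while Nova pulls the image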
>
> Vasily.

Cheers.
––––
Sébastien Han
Senior Cloud Architect

"Always give 100%. Unless you're giving blood."

Mail: seb@xxxxxxxxxx
Address: 11 bis, rue Roquépine - 75008 Paris