On Thu, Apr 11, 2019 at 12:07 AM Brayan Perera <brayan.perera@xxxxxxxxx> wrote:
>
> Dear Jason,
>
> Thanks for the reply.
>
> We are using Python 2.7.5.
>
> Yes, the script is based on OpenStack code.
>
> As suggested, we have tried chunk_size 32 and 64, and both give the
> same incorrect checksum value.
>
> We also tried copying the same image into a different pool, which
> produced the same incorrect checksum.

My best guess is that there is some odd encoding issue between the raw
byte stream and Python strings. Can you tweak your Python code to
generate an md5sum for each chunk (say 4 MiB per chunk, to match the
object size) and compare that against the md5sums of the 4 MiB chunks
of the associated "rbd export" data file (split -b 4194304
--filter=md5sum)? That will let you isolate the issue to a specific
section of the image.

> Thanks & Regards,
> Brayan
>
> On Wed, Apr 10, 2019 at 6:21 PM Jason Dillaman <jdillama@xxxxxxxxxx> wrote:
> >
> > On Wed, Apr 10, 2019 at 1:46 AM Brayan Perera <brayan.perera@xxxxxxxxx> wrote:
> > >
> > > Dear All,
> > >
> > > Ceph version: 12.2.5-2.ge988fb6.el7
> > >
> > > We are facing an issue with Glance backed by Ceph: when we try to
> > > create an instance or volume from an image, it throws a checksum
> > > error. When we use "rbd export" and run md5sum on the result, the
> > > value matches the Glance checksum.
> > >
> > > When we use the following script, it produces the same incorrect
> > > checksum as Glance.
> >
> > What version of Python are you using?
> >
> > > We used the images below for testing.
> > > 1. Failing image (checksum mismatch): ffed4088-74e1-4f22-86cb-35e7e97c377c
> > > 2. Passing image (checksum identical): c048f0f9-973d-4285-9397-939251c80a84
> > >
> > > Output from the storage node:
> > >
> > > 1.
> > > Failing image: ffed4088-74e1-4f22-86cb-35e7e97c377c
> > > checksum from glance database: 34da2198ec7941174349712c6d2096d8
> > > [root@storage01moc ~]# python test_rbd_format.py ffed4088-74e1-4f22-86cb-35e7e97c377c admin
> > > Image size: 681181184
> > > checksum from ceph: b82d85ae5160a7b74f52be6b5871f596
> > > Remarks: checksum is different
> > >
> > > 2. Passing image: c048f0f9-973d-4285-9397-939251c80a84
> > > checksum from glance database: 4f977f748c9ac2989cff32732ef740ed
> > > [root@storage01moc ~]# python test_rbd_format.py c048f0f9-973d-4285-9397-939251c80a84 admin
> > > Image size: 1411121152
> > > checksum from ceph: 4f977f748c9ac2989cff32732ef740ed
> > > Remarks: checksum is identical
> > >
> > > We are wondering whether this issue is in the Ceph Python bindings
> > > or in Ceph itself.
> > >
> > > Please note that we do not have Ceph pool tiering configured.
> > >
> > > Please let us know whether anyone has faced a similar issue, and
> > > about any fixes for it.
> > >
> > > test_rbd_format.py
> > > ===================================================
> > > import rados, sys, rbd
> > >
> > > image_id = sys.argv[1]
> > > try:
> > >     rados_id = sys.argv[2]
> > > except IndexError:
> > >     rados_id = 'openstack'
> > >
> > >
> > > class ImageIterator(object):
> > >     """
> > >     Reads data from an RBD image, one chunk at a time.
> > >     """
> > >
> > >     def __init__(self, conn, pool, name, snapshot, store, chunk_size='8'):
> >
> > Am I correct in assuming this was adapted from OpenStack code? That
> > 8-byte "chunk" is going to be terribly inefficient for computing a
> > checksum. Not that it should matter, but does it still fail if you
> > increase this to 32 KiB or 64 KiB?
> > >         self.pool = pool
> > >         self.conn = conn
> > >         self.name = name
> > >         self.snapshot = snapshot
> > >         self.chunk_size = chunk_size
> > >         self.store = store
> > >
> > >     def __iter__(self):
> > >         try:
> > >             with conn.open_ioctx(self.pool) as ioctx:
> > >                 with rbd.Image(ioctx, self.name,
> > >                                snapshot=self.snapshot) as image:
> > >                     img_info = image.stat()
> > >                     size = img_info['size']
> > >                     bytes_left = size
> > >                     while bytes_left > 0:
> > >                         length = min(self.chunk_size, bytes_left)
> > >                         data = image.read(size - bytes_left, length)
> > >                         bytes_left -= len(data)
> > >                         yield data
> > >                     raise StopIteration()
> > >         except rbd.ImageNotFound:
> > >             raise exceptions.NotFound(
> > >                 _('RBD image %s does not exist') % self.name)
> > >
> > > conn = rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id=rados_id)
> > > conn.connect()
> > >
> > >
> > > with conn.open_ioctx('images') as ioctx:
> > >     try:
> > >         with rbd.Image(ioctx, image_id,
> > >                        snapshot='snap') as image:
> > >             img_info = image.stat()
> > >             print "Image size: %s " % img_info['size']
> > >             iter, size = (ImageIterator(conn, 'images', image_id,
> > >                                         'snap', 'rbd'), img_info['size'])
> > >             import six, hashlib
> > >             md5sum = hashlib.md5()
> > >             for chunk in iter:
> > >                 if isinstance(chunk, six.string_types):
> > >                     chunk = six.b(chunk)
> > >                 md5sum.update(chunk)
> > >             md5sum = md5sum.hexdigest()
> > >             print "checksum from ceph: " + md5sum
> > >     except:
> > >         raise
> > > ===================================================
> > >
> > >
> > > Thank you!
> > >
> > > --
> > > Best Regards,
> > > Brayan Perera
> > > _______________________________________________
> > > ceph-users mailing list
> > > ceph-users@xxxxxxxxxxxxxx
> > > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> >
> > --
> > Jason
>
>
> --
> Best Regards,
> Brayan Perera

--
Jason
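The per-chunk comparison suggested in the thread can be sketched as below. This is a minimal sketch, not the Glance/OpenStack code: `chunk_md5s` is a hypothetical helper name, and it assumes `path` points at the data file produced by "rbd export", so its output can be compared line by line with `split -b 4194304 --filter=md5sum <file>`:

```python
import hashlib

def chunk_md5s(path, chunk_size=4 * 1024 * 1024):
    """md5 each fixed-size chunk of a file, mirroring
    `split -b 4194304 --filter=md5sum <file>` on the rbd export output.

    Returns a list of (offset, hex-digest) pairs, one per chunk.
    """
    digests = []
    offset = 0
    with open(path, 'rb') as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            digests.append((offset, hashlib.md5(data).hexdigest()))
            offset += len(data)
    return digests
```

Running the same loop over `image.read(offset, length)` on the cluster side and diffing the two digest lists narrows a mismatch down to the first 4 MiB region, and hence the RADOS object, whose bytes differ.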
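One detail in the script worth flagging: `chunk_size` defaults to the string `'8'` rather than an integer. Under Python 2's mixed-type comparison rules, numbers sort before strings, so `min(self.chunk_size, bytes_left)` returns `bytes_left` and the whole remaining image is requested in a single `image.read()` call. The sketch below uses a plain callable in place of `rbd.Image.read` (an assumption, so it runs without a cluster) and an integer chunk size; a correctly accumulated md5 comes out the same for any chunk size:

```python
import hashlib

def md5_of_region(read_at, size, chunk_size=4 * 1024 * 1024):
    """Accumulate an md5 over a readable region, one chunk at a time.

    read_at(offset, length) stands in for rbd.Image.read; chunk_size
    must be an int, not the string '8' as in the original script.
    """
    assert isinstance(chunk_size, int)
    md5 = hashlib.md5()
    bytes_left = size
    while bytes_left > 0:
        length = min(chunk_size, bytes_left)
        data = read_at(size - bytes_left, length)
        md5.update(data)
        bytes_left -= len(data)
    return md5.hexdigest()
```

If the same loop driven by a real `image.read` still disagrees with md5sum on the "rbd export" file, the problem lies in what `read` returns, not in how the hash is accumulated.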