The parent / clone relationship is still established via the snapshot, just
not via the HEAD revision of the image. Therefore, probably the easiest fix
would be to ensure that a rollback also copies that parent / clone
relationship link from the snapshot to the HEAD revision. Basically, since
you rolled back past the flatten operation, we would also want the rollback
operation to fully roll back the flatten (not just the block objects).

--
Jason Dillaman

----- Original Message -----
> From: "wuxingyi" <wuxingyigfs@xxxxxxxxxxx>
> To: dillaman@xxxxxxxxxx
> Cc: ceph-users@xxxxxxxxxxxxxx
> Sent: Thursday, January 28, 2016 4:52:57 AM
> Subject: RE: data loss when flattening a cloned image on giant
>
> Thank you for your quick reply :)
>
> The first object of the cloned image has already been lost after
> flattening, so it may be too late to restore the parent relationship
> during the rollback operation.
>
> ----------------------------------------
> > Date: Tue, 26 Jan 2016 09:50:56 -0500
> > From: dillaman@xxxxxxxxxx
> > To: wuxingyigfs@xxxxxxxxxxx
> > CC: ceph-users@xxxxxxxxxxxxxx; wuxingyi@xxxxxxxx
> > Subject: Re: data loss when flattening a cloned image on giant
> >
> > Interesting find. This is an interesting edge-case interaction between
> > snapshot, flatten, and rollback. I believe this was unintentionally fixed
> > by the deep-flatten feature added in infernalis. Probably the simplest fix
> > for giant would be to restore the parent image link during the rollback
> > (since that link is still established via the snapshot).
> >
> > --
> >
> > Jason Dillaman
> >
> >
> > ----- Original Message -----
> >> From: "wuxingyi" <wuxingyigfs@xxxxxxxxxxx>
> >> To: ceph-users@xxxxxxxxxxxxxx
> >> Cc: wuxingyi@xxxxxxxx
> >> Sent: Tuesday, January 26, 2016 3:11:11 AM
> >> Subject: Re: data loss when flattening a cloned image on giant
> >>
> >> Really sorry for the bad format, I will put it here again.
> >>
> >> I found data lost when flattening a cloned image on giant (0.87.2). The
> >> problem can be easily reproduced by running the following script:
> >>
> >> #!/bin/bash
> >> ceph osd pool create wuxingyi 1 1
> >> rbd create --image-format 2 wuxingyi/disk1.img --size 8
> >> # writing "FOOBAR" at offset 0
> >> python writetooffset.py disk1.img 0 FOOBAR
> >> rbd snap create wuxingyi/disk1.img@SNAPSHOT
> >> rbd snap protect wuxingyi/disk1.img@SNAPSHOT
> >>
> >> echo "start cloning"
> >> rbd clone wuxingyi/disk1.img@SNAPSHOT wuxingyi/CLONEIMAGE
> >>
> >> # writing "WUXINGYI" at offset 4M of the cloned image
> >> python writetooffset.py CLONEIMAGE $((4*1048576)) WUXINGYI
> >> rbd snap create wuxingyi/CLONEIMAGE@CLONEDSNAPSHOT
> >>
> >> # modify at offset 4M of the cloned image
> >> python writetooffset.py CLONEIMAGE $((4*1048576)) HEHEHEHE
> >>
> >> echo "start flattening CLONEIMAGE"
> >> rbd flatten wuxingyi/CLONEIMAGE
> >>
> >> echo "before rollback"
> >> rbd export wuxingyi/CLONEIMAGE && hexdump -C CLONEIMAGE
> >> rm CLONEIMAGE -f
> >> rbd snap rollback wuxingyi/CLONEIMAGE@CLONEDSNAPSHOT
> >> echo "after rollback"
> >> rbd export wuxingyi/CLONEIMAGE && hexdump -C CLONEIMAGE
> >> rm CLONEIMAGE -f
> >>
> >> where writetooffset.py is a simple Python script writing the given data
> >> to the given offset of the image:
> >>
> >> #!/usr/bin/python
> >> # coding=utf-8
> >> import sys
> >> import rados
> >> import rbd
> >>
> >> cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
> >> cluster.connect()
> >> ioctx = cluster.open_ioctx('wuxingyi')
> >> image = rbd.Image(ioctx, sys.argv[1])
> >> # rbd.Image.write(data, offset)
> >> image.write(sys.argv[3], int(sys.argv[2]))
> >>
> >> The output is something like:
> >>
> >> before rollback
> >> Exporting image: 100% complete...done.
> >> 00000000  46 4f 4f 42 41 52 00 00  00 00 00 00 00 00 00 00  |FOOBAR..........|
> >> 00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> >> *
> >> 00400000  48 45 48 45 48 45 48 45  00 00 00 00 00 00 00 00  |HEHEHEHE........|
> >> 00400010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> >> *
> >> 00800000
> >> Rolling back to snapshot: 100% complete...done.
> >> after rollback
> >> Exporting image: 100% complete...done.
> >> 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> >> *
> >> 00400000  57 55 58 49 4e 47 59 49  00 00 00 00 00 00 00 00  |WUXINGYI........|
> >> 00400010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> >> *
> >> 00800000
> >>
> >> We can easily see that the first object of the image is definitely lost,
> >> and the data loss happens during flattening: only a "head" version of the
> >> first object is created, when a "snapid" version of the object should
> >> also be created and written during the flatten.
> >> However, when running this script against upstream code, I cannot hit
> >> this problem. I looked through the upstream code but could not find which
> >> commit fixes this bug. I also found that the whole state machine dealing
> >> with RBD layering has changed a lot since the giant release.
> >>
> >> Could you please give me some hints on which commits I should backport?
> >> Thanks~~~~
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@xxxxxxxxxxxxxx
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
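[Editor's note: the failure mode discussed above can be condensed into a toy
Python model. This is not librbd code; `ToyImage` and all of its methods are
invented for illustration, and it only sketches the object/snapshot
bookkeeping under the assumptions stated in the comments.]

```python
# Toy model of the bug: a clone's snapshot records both its objects and its
# parent link; the giant-era flatten copies parent data into HEAD only and
# drops the parent link, so a rollback past the flatten loses objects that
# had never been copied up into the clone.

OBJECT_COUNT = 2  # the thread's 8 MB test image is two 4 MB objects

class ToyImage:
    def __init__(self, parent=None):
        self.head = {}        # object index -> data (HEAD revision)
        self.snaps = {}       # snap name -> (copy of head, parent link)
        self.parent = parent  # parent image (clone relationship), or None

    def write(self, idx, data):
        self.head[idx] = data

    def read(self, idx):
        if idx in self.head:
            return self.head[idx]
        if self.parent is not None:
            return self.parent.read(idx)  # clone reads fall through to parent
        return b"\x00"                    # unwritten objects read as zeros

    def snap_create(self, name):
        # A snapshot captures the objects written so far AND the parent link.
        self.snaps[name] = (dict(self.head), self.parent)

    def flatten_giant(self):
        # giant (0.87.2) behaviour: copy parent data into the HEAD revision
        # only (no "snapid" revisions are written), then drop the parent link.
        for idx in range(OBJECT_COUNT):
            if idx not in self.head:
                self.head[idx] = self.parent.read(idx)
        self.parent = None

    def rollback(self, name, restore_parent_link=False):
        head, parent = self.snaps[name]
        self.head = dict(head)            # roll back the block objects
        if restore_parent_link:           # the fix proposed in the thread
            self.parent = parent

base = ToyImage()
base.write(0, b"FOOBAR")
base.snap_create("SNAPSHOT")

clone = ToyImage(parent=base)
clone.write(1, b"WUXINGYI")
clone.snap_create("CLONEDSNAPSHOT")
clone.write(1, b"HEHEHEHE")
clone.flatten_giant()
print(clone.read(0))  # b'FOOBAR' -- intact before rollback

clone.rollback("CLONEDSNAPSHOT")
print(clone.read(0))  # b'\x00' -- first object lost, matching the hexdump

clone.rollback("CLONEDSNAPSHOT", restore_parent_link=True)
print(clone.read(0))  # b'FOOBAR' -- reads fall through to the parent again
```

The last rollback shows why copying the parent / clone link from the snapshot
back to HEAD is sufficient here: object 0 never existed in the clone, so
restoring the link lets reads fall through to the parent again instead of
returning zeros.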