Re: Rbd export-diff bug? rbd export-diff generates different incremental files

Hi, everyone.

I read the source code. Could the following happen? A WRITE op targeting object X is followed by a series of ops, the last of which is a READ op on the same object issued by the "rbd export" command. The WRITE op modifies the ObjectContext (obc) of object X to add the new snap object, but the modified obc is evicted from the SharedLRU cache "object_contexts" before the filestore threads write the WRITE op to the underlying file system, and before the READ op looks up its obc. If the READ op then loads its obc from the underlying file system, still before the WRITE op has been applied, it gets an outdated obc that points to the HEAD object of object X, provided no other modification to object X has been executed since the snapshot was created. If this is possible, the result would be an inconsistent snapshot view.

Is this correct?
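
To make the ordering I am asking about concrete, here is a toy model in Python. It is not Ceph code: ToyObjectStore and all of its members are hypothetical stand-ins for the obc cache, the filestore queue and the on-disk state, and it only shows that this ordering would produce a stale read in such a model. Whether the real OSD can reach this state is exactly the open question.

```python
from collections import OrderedDict, deque


class ToyObjectStore:
    """Toy model only, NOT Ceph code; every name here is hypothetical."""

    def __init__(self, cache_size):
        self.disk = {}              # stands in for the on-disk object metadata
        self.cache = OrderedDict()  # stands in for the LRU cache of object contexts
        self.cache_size = cache_size
        self.pending = deque()      # writes queued but not yet applied to disk

    def _cache_put(self, name, ctx):
        self.cache[name] = ctx
        self.cache.move_to_end(name)
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry

    def write(self, name, ctx):
        # The write updates the cached context now; the disk is updated later.
        self._cache_put(name, ctx)
        self.pending.append((name, ctx))

    def flush(self):
        # Apply all queued writes to the disk, in order.
        while self.pending:
            name, ctx = self.pending.popleft()
            self.disk[name] = ctx

    def read(self, name):
        if name in self.cache:
            self.cache.move_to_end(name)
            return self.cache[name]
        ctx = self.disk[name]       # cache miss: fall back to the disk state
        self._cache_put(name, ctx)
        return ctx


store = ToyObjectStore(cache_size=1)
store.disk["X"] = "head-only"
store.disk["Y"] = "head-only"

store.write("X", "head+snap")  # cached and queued, not on disk yet
store.read("Y")                # touching Y evicts X from the size-1 cache
stale = store.read("X")        # miss: loaded from disk before the flush
store.flush()                  # only now does the disk see the write
```

In this model the read of X returns the pre-write context even though the write was issued first, which is the inconsistency described above.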

From: ceph-users [mailto:ceph-users-bounces at lists.ceph.com] On Behalf Of Zhongyan Gu
Sent: February 20, 2017 18:47
To: ceph-users; Jason Dillaman; sweil at redhat.com
Subject: Re: [ceph-users] Rbd export-diff bug? rbd export-diff generates different incremental files

Could this be a synchronization issue, where multiple clients access the same object: one client (the VM/qemu) is updating the object while another client (the ceph rbd export/export-diff execution) is reading its content? How does Ceph ensure consistency in this case?

Zhongyan

On Mon, Feb 20, 2017 at 11:21 AM, Zhongyan Gu <zhongyan.gu at gmail.com> wrote:
BTW, we used the hammer version with the following fix. That issue was also reported by us during earlier backup testing.
https://github.com/ceph/ceph/pull/12218/files
librbd: diffs to clone's first snapshot should include parent diffs


Zhongyan

On Mon, Feb 20, 2017 at 11:13 AM, Zhongyan Gu <zhongyan.gu at gmail.com> wrote:

Hi Sage and Jason,
My company is building a backup system based on the rbd export-diff and import-diff commands.
However, in a recent test we found some strange behavior of export-diff. Long story short: sometimes, when repeatedly executing rbd export-diff --from-snap snap1 image@snap2 - | md5sum, md5sum returns different values.
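
The repeated check can be scripted. Below is a minimal Python sketch, assuming the rbd CLI is on the PATH; the helper names (stream_md5, distinct_digests, export_diff_cmd) are made up for illustration and are not part of any Ceph tooling. It runs the same command several times and collects the distinct MD5 digests of its output, so more than one digest means the runs disagreed.

```python
import hashlib
import subprocess


def stream_md5(cmd):
    """Run cmd and return the MD5 hex digest of everything it writes to stdout."""
    digest = hashlib.md5()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        # Read in 1 MiB chunks so a large diff is never held in memory at once.
        for chunk in iter(lambda: proc.stdout.read(1 << 20), b""):
            digest.update(chunk)
    if proc.returncode != 0:
        raise RuntimeError(f"command failed with exit code {proc.returncode}")
    return digest.hexdigest()


def distinct_digests(cmd, runs=5):
    """Run cmd `runs` times and return the set of distinct output digests."""
    return {stream_md5(cmd) for _ in range(runs)}


def export_diff_cmd(image, from_snap, to_snap):
    """Build the export-diff command from the report; "-" writes to stdout."""
    return ["rbd", "export-diff", "--from-snap", from_snap,
            f"{image}@{to_snap}", "-"]
```

For example, distinct_digests(export_diff_cmd("image", "snap1", "snap2")) should contain exactly one digest if export-diff is deterministic for a fixed pair of snaps.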
The details are:
We use two ceph rbd clusters: A for online VMs and B for backups.
The VM image in question is cloned from a parent image. Initially, our backup system does a full backup with the rbd export/import commands. Then every day we do an incremental backup with the rbd export-diff/import-diff commands.
To make sure the data is consistent, we also compare the md5 of the online VM image@snapN with that of the backup VM image@snapN.
Our tests found that for some VM images the md5 check sometimes fails: the online VM image@snapN doesn't match the backup VM image@snapN.
To narrow this down, we manually generated the incremental file with rbd export-diff between the specific snaps and found that its md5 didn't match that of the file generated by the backup scripts.
Comparing the two binary files, we found only a small difference: some bytes are not the same.
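
For anyone who wants to repeat this comparison, here is a small Python sketch that lists the byte ranges where two files differ. The function name differing_ranges is hypothetical. The offsets it reports are positions inside the export-diff files themselves, not offsets inside the image; mapping them back to image extents would require parsing the diff format, which this sketch does not attempt.

```python
def differing_ranges(path_a, path_b, chunk_size=1 << 20):
    """Return (start, end) byte ranges, end exclusive, where the two files differ.

    If one file is longer, the extra tail counts as differing.
    """
    ranges = []
    start = None   # start offset of the differing run currently open, if any
    offset = 0     # file offset of the chunk being compared
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            n = max(len(a), len(b))
            if a == b:
                # Identical chunk: close any run left open by the previous chunk.
                if start is not None:
                    ranges.append((start, offset))
                    start = None
            else:
                for i in range(n):
                    same = i < len(a) and i < len(b) and a[i] == b[i]
                    if same:
                        if start is not None:
                            ranges.append((start, offset + i))
                            start = None
                    elif start is None:
                        start = offset + i
            offset += n
    if start is not None:
        ranges.append((start, offset))
    return ranges
```

Calling differing_ranges("manual.diff", "backup.diff") on the two incremental files (the file names here are placeholders) gives the list of differing spans, which shows whether the mismatch is confined to a few small regions.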
Could this be an export-diff bug? As far as I know, once two snaps have been created, the diff between them should always be the same. So why doesn't export-diff work as expected, returning different md5 checksums? Is some corner case not handled well, or has anyone else had the same experience? BTW, we ran an fio I/O workload in the VMs 24 hours a day during the backup test.

Thanks,
Zhongyan



