AFAIK, that fix is scheduled to be included in Hammer 0.94.10 (which
hasn't been released yet). Is this issue only occurring on cloned
images? Since Hammer is nearly end-of-life, can you reproduce this issue
on Jewel? Are the affected images using cache tiering? Can you determine
an easy-to-reproduce case?

On Sun, Feb 19, 2017 at 10:21 PM, Zhongyan Gu <zhongyan.gu at gmail.com> wrote:
> BTW, we are using a Hammer version with the following fix. That issue
> was also reported by us during earlier backup testing.
> https://github.com/ceph/ceph/pull/12218/files
> librbd: diffs to clone's first snapshot should include parent diffs
>
> Zhongyan
>
> On Mon, Feb 20, 2017 at 11:13 AM, Zhongyan Gu <zhongyan.gu at gmail.com> wrote:
>>
>> Hi Sage and Jason,
>>
>> My company is building a backup system based on the rbd export-diff
>> and import-diff commands.
>>
>> However, in recent tests we found some strange behavior in
>> export-diff. Long story short: repeatedly executing
>> rbd export-diff --from-snap snap1 image@snap2 - | md5sum
>> sometimes returns different md5sum values.
>>
>> The details are:
>>
>> We use two Ceph RBD clusters: A for online VM usage and B for backup
>> usage.
>>
>> The VM image in question is cloned from a parent image. Initially our
>> backup system does a full backup with the rbd export/import commands.
>> Then every day we do an incremental backup with the rbd
>> export-diff/import-diff commands.
>>
>> To ensure data consistency, we also compare the md5 of the online VM
>> image at snapN with that of the backup VM image at snapN.
>>
>> Our tests found that for some VM images the md5 check sometimes
>> fails: the online VM image at snapN doesn't match the backup VM image
>> at snapN.
>>
>> To narrow down this issue, we manually generated the incremental file
>> with rbd export-diff between the specific snaps and found that its md5
>> didn't match that of the file generated by the backup scripts.
>>
>> Comparing those two binary files, we found only a small difference:
>> some bytes are not the same.
>>
>> Could this be an export-diff bug? As far as I know, if we create two
>> snaps, the diff between those two snaps should always be the same. So
>> why does export-diff not work as expected and return a different md5?
>> Is there some corner case that was not well considered, or does anyone
>> else have the same experience?
>> BTW, we ran an fio I/O workload for 24 hours in the VMs during the
>> backup test.
>>
>> Thanks,
>>
>> Zhongyan
>
>

--
Jason
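
For anyone trying to narrow this down to an easy-to-reproduce case, one option is to run the same export-diff several times against a fixed pair of snapshots and compare the checksums. The following is only a minimal sketch, assuming a POSIX shell with md5sum available. check_stable is a hypothetical helper name, not part of rbd, and the pool, image, and snapshot names in the usage comment are placeholders.

```shell
#!/bin/sh
# check_stable RUNS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND the given number of times, hashes its
# stdout with md5sum, and stops at the first run whose hash differs from
# the hash of the first run.
check_stable() {
    runs=$1
    shift
    first=""
    i=1
    while [ "$i" -le "$runs" ]; do
        # md5sum prints "<hash>  -" for stdin; keep only the hash field.
        sum=$("$@" | md5sum | cut -d' ' -f1)
        echo "run $i: $sum"
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            echo "MISMATCH on run $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "stable across $runs runs"
}

# Example usage (placeholder pool/image/snapshot names):
# check_stable 10 rbd export-diff --from-snap snap1 rbd/image@snap2 -
```

Running it once against a cloned image and once against a non-cloned image, with the same snapshots each time, would help answer whether the mismatch is specific to clones.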