Rbd export-diff bug? rbd export-diff generates different incremental files

Hi Jason,

Thanks for the reply.

We are not sure this issue occurs only on cloned images. We think it is
more likely a generic synchronization issue. Our production and test setups
are all based on Hammer, so we haven't had a chance to try Jewel yet, but we
will try Jewel later.

We don't use cache tiering in our test environment. So far we haven't
found an easy way to reproduce this issue, but we will keep working on
a test case to verify the root cause.
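
A repeated-run check along these lines could be a starting point for such a
test case. It is only a sketch: the helper names are hypothetical, and the
image and snapshot names passed on the command line stand in for real ones.
It runs rbd export-diff several times and reports whether the MD5 of the
output stays the same across runs.

```python
import hashlib
import subprocess
import sys


def export_diff_cmd(image, from_snap, to_snap):
    """Build the rbd export-diff command line that writes the diff to stdout."""
    return ["rbd", "export-diff", "--from-snap", from_snap,
            f"{image}@{to_snap}", "-"]


def digest_of_command(cmd):
    """Run cmd and return the MD5 hex digest of everything it writes to stdout."""
    md5 = hashlib.md5()
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        # Stream in 1 MiB chunks so a large diff is never held in memory.
        for chunk in iter(lambda: proc.stdout.read(1 << 20), b""):
            md5.update(chunk)
    if proc.returncode != 0:
        raise RuntimeError(f"command failed with exit code {proc.returncode}")
    return md5.hexdigest()


def distinct_digests(cmd, runs=10):
    """Run cmd `runs` times and return the set of distinct digests observed."""
    return {digest_of_command(cmd) for _ in range(runs)}


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage: python check_diff.py pool/image snap1 snap2
    digests = distinct_digests(export_diff_cmd(*sys.argv[1:4]))
    print("stable" if len(digests) == 1 else f"unstable: {sorted(digests)}")
```

More than one digest for the same pair of snapshots would confirm the
inconsistency without involving the backup scripts at all.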

Anyway, if my colleague Xuehan's root cause analysis is correct, then this
would be a serious defect in Ceph's snapshot mechanism.



You mentioned the fix is scheduled to be included in Hammer 0.94.10. Is
there any fix already available?



Thanks,

Zhongyan



On Mon, Feb 20, 2017 at 9:17 PM, Jason Dillaman <jdillama at redhat.com> wrote:

> AFAIK, that fix is scheduled to be included in Hammer 0.94.10 (which
> hasn't been released yet).
>
> Is this issue only occurring on cloned images? Since Hammer is nearly
> end-of-life, can you repeat this issue on Jewel? Are the affected
> images using cache tiering? Can you determine an easy-to-reproduce
> case?
>
> On Sun, Feb 19, 2017 at 10:21 PM, Zhongyan Gu <zhongyan.gu at gmail.com>
> wrote:
> > BTW, we used a Hammer version with the following fix. That issue was
> > also reported by us during our earlier backup testing.
> > https://github.com/ceph/ceph/pull/12218/files
> > librbd: diffs to clone's first snapshot should include parent diffs
> >
> >
> >
> > Zhongyan
> >
> > On Mon, Feb 20, 2017 at 11:13 AM, Zhongyan Gu <zhongyan.gu at gmail.com>
> > wrote:
> >>
> >>
> >> Hi Sage and Jason,
> >>
> >> My company is building a backup system based on the rbd export-diff
> >> and import-diff commands.
> >>
> >> However, in recent tests we found some strange behavior in
> >> export-diff. Long story short: sometimes, when we repeatedly execute
> >> rbd export-diff --from-snap snap1 image@snap2 - | md5sum, md5sum
> >> returns different values.
> >>
> >> The details are:
> >>
> >> We use two Ceph RBD clusters: A for online VM usage and B for backup
> >> usage.
> >>
> >> Each VM image in question is cloned from a parent image. Initially
> >> our backup system does a full backup with the rbd export/import
> >> commands. Then every day we do an incremental backup with the rbd
> >> export-diff/import-diff commands.
> >>
> >> To ensure data consistency, we also compare the md5 of the online
> >> VM image@snapN with that of the backup VM image@snapN.
> >>
> >> Our tests found that sometimes, for some VM images, the md5 check
> >> fails: the online VM image@snapN doesn't match the backup VM
> >> image@snapN.
> >>
> >> To narrow this down, we manually generated the incremental file with
> >> rbd export-diff between the specific snaps and found that its md5
> >> didn't match the file generated by the backup scripts.
> >>
> >> Comparing those two binary files, we found only a small difference:
> >> some bytes are not the same.
> >>
> >> I wonder whether this could be an export-diff bug. As far as I know,
> >> once two snaps are created, the diff between them should always be
> >> the same. So why doesn't export-diff work as expected, and why does
> >> it return different md5 values? Is some corner case not well
> >> handled, or has anyone else had the same experience?
> >> BTW, we ran an fio I/O workload in the VMs for 24 hours during the
> >> backup test.
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Zhongyan
> >
> >
>
>
>
> --
> Jason
>