> -----Original Message-----
> From: Dałek, Piotr [mailto:Piotr.Dalek@xxxxxxxxxxxxxx]
> Sent: Wednesday, August 26, 2015 2:02 AM
> To: Sage Weil; Deneau, Tom
> Cc: ceph-devel@xxxxxxxxxxxxxxx; ceph-users@xxxxxxxx
> Subject: RE: rados bench object not correct errors on v9.0.3
>
> > -----Original Message-----
> > From: ceph-devel-owner@xxxxxxxxxxxxxxx
> > [mailto:ceph-devel-owner@xxxxxxxxxxxxxxx] On Behalf Of Sage Weil
> > Sent: Tuesday, August 25, 2015 7:43 PM
> >
> > > I have built rpms from the tarball
> > > http://ceph.com/download/ceph-9.0.3.tar.bz2.
> > > Have done this for fedora 21 x86_64 and for aarch64. On both
> > > platforms when I run a single node "cluster" with a few osds and run
> > > rados bench read tests (either seq or rand) I get occasional reports
> > > like
> > >
> > >     benchmark_data_myhost_20729_object73 is not correct!
> > >
> > > I never saw these with similar rpm builds on these platforms from
> > > 9.0.2 sources.
> > >
> > > Also, if I go to an x86-64 system running Ubuntu trusty for which I
> > > am able to install prebuilt binary packages via
> > >     ceph-deploy install --dev v9.0.3
> > >
> > > I do not see the errors there.
> >
> > Hrm.. haven't seen it on this end, but we're running/testing master
> > and not 9.0.2 specifically. If you can reproduce this on master,
> > that'd be very helpful!
> >
> > There have been some recent changes to rados bench... Piotr, does this
> > seem like it might be caused by your changes?
>
> Yes. My PR #4690 (https://github.com/ceph/ceph/pull/4690) caused rados bench
> to be fast enough to sometimes run into a race condition between librados's
> AIO and objbencher processing. That was fixed in PR #5152
> (https://github.com/ceph/ceph/pull/5152), which didn't make it into 9.0.3.
> Tom, you can confirm this by inspecting the contents of the objects in
> question (their contents should be perfectly fine and in line with other
> objects).
> In the meantime you can either apply the patch from PR #5152 on your own or
> use --no-verify.
>
> With best regards / Pozdrawiam
> Piotr Dałek

Piotr --

Thank you. Yes, when I looked at the contents of the objects, they always
looked correct. And yes, a single object would sometimes report an error and
sometimes not, so a race condition makes sense.

A couple of questions:

* Why would I not see this behavior with the pre-built 9.0.3 binaries that
  get installed using "ceph-deploy install --dev v9.0.3"? I would assume they
  are built from the same sources as the 9.0.3 tarball.

* So I assume one should not compare pre-9.0.3 rados bench numbers with
  numbers from 9.0.3 and later? The pull request
  https://github.com/ceph/ceph/pull/4690 did not mention the effect on final
  bandwidth numbers. Did you notice what that effect was?

-- Tom
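
For anyone who wants to repeat the object check described above, here is a
minimal sketch of one way to do it: read the same object several times and
compare digests. It assumes the benchmark objects still exist in the pool and
that the `rados` command-line tool is on the PATH. The helper names
(fetch_with_rados_cli, distinct_digests, looks_stable) and the pool name in
the usage note are made up for illustration; they are not part of Ceph.

```python
import hashlib
import os
import subprocess
import tempfile
from typing import Callable, Set


def fetch_with_rados_cli(pool: str, obj: str) -> bytes:
    """Read one object using `rados -p <pool> get <obj> <outfile>`.

    Hypothetical helper: it only wraps the CLI and returns the raw bytes.
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        subprocess.run(["rados", "-p", pool, "get", obj, path], check=True)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)


def distinct_digests(fetch: Callable[[], bytes], reads: int = 5) -> Set[str]:
    """Call fetch() `reads` times and collect the distinct SHA-256 digests."""
    return {hashlib.sha256(fetch()).hexdigest() for _ in range(reads)}


def looks_stable(fetch: Callable[[], bytes], reads: int = 5) -> bool:
    """True if every read returned byte-identical content."""
    return len(distinct_digests(fetch, reads)) == 1
```

Usage, with an assumed pool name:

    looks_stable(lambda: fetch_with_rados_cli(
        "rbd", "benchmark_data_myhost_20729_object73"))

If this returns True for an object that rados bench intermittently reports as
"not correct", that is consistent with the stored data being fine and the
mismatch arising in the benchmark's verification step, though it does not
prove it.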