On 06/30/2017 05:21 AM, Nathan Cutler wrote:
Hi again Josh:

Here comes the recap of 10.2.8 status.

All requested PRs have been merged, so I ran a "rados" suite and an "upgrade/hammer-x" suite. There were some failures, but most were obvious infrastructure noise and disappeared on re-run. There were also two failures that stood out and seemed like they might be real bugs:

[1] rados issue: http://tracker.ceph.com/issues/20449
[2] upgrade issue: http://tracker.ceph.com/issues/13381

Regarding [1], my initial analysis was wrong. The real cause of the failure is a transient gevent/greenlet timeout; happens in about 40% of the runs.

Regarding [2], Sage analyzed the initial failure in the tracker. I re-ran the test on both smithi and vps with the following results:

* on vps, both jobs passed
* on smithi, one job passed and the other failed. However, the failure was for a different reason ("ceph-objectstore-tool: exp list-pgs failure with status 1") - see http://pulpito.front.sepia.ceph.com/smithfarm-2017-06-30_09:08:18-upgrade:hammer-x-wip-jewel-backports-distro-basic-smithi/
* http://tracker.ceph.com/issues/13381 was not reproduced

What other testing do you think is needed before we send 10.2.8 to QE?
What you've run already looks sufficient - I was worried about 13381 before, but it does not seem related to the 10.2.8 backports at this point - just a pre-existing rare race.

So I'd say it's ready for QE. Thanks!

Josh