On Wed, Oct 10, 2018 at 2:47 AM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote: > > Hey guys, I'd like to understand how our tests failed in the 13.2.2 > CephFS purge queue issue. > > In terms of the actual code merge flow, it looks like: > > * The original patch passed review and test and merged to master > August 6 in https://github.com/ceph/ceph/pull/22850 > * but was wrong, and reverted a day later. > * Fixed PR merged August 16 in https://github.com/ceph/ceph/pull/23467 > * A backport ticket was created August 21: http://tracker.ceph.com/issues/26989 > * A backport PR was submitted August 28 based on the original broken > code at https://github.com/ceph/ceph/pull/23818 > * It passed review and test and merged on September 12. > > Obviously we can wish review had been more successful, but we can't > rely on that. So I've got some questions: > > * Zheng, how did you discover/notice that the original PR was broken? > Did some tests start failing that previously passed? > I found it during upgrading my test cluster > * Nathan, are there backport tools that make doing the right thing or > wrong thing easier in situations like this, where multiple commits > will claim to Fix the same bug and some of them are wrong? > > And some observations and suggestions about testing: > * Apparently none of our upgrade tests involve running new-code MDS on > a purge queue written by the old code. > * We can and should fix that narrow issue, but it makes me think the > upgrade tests are in general not as robust as one wants. Can the FS > team do an audit? > > * Looking at the issue (new data members were encoded at the front, > and without bumping the struct_v), I think this should have been > detected if we ran it through the ceph-object-corpus dencoder tests > and had the right bits and pieces present there. These tests are part > of "make check" but haven't been updated in quite a while — I think > this bumps the priority of doing that work up since we now have a > specific failure it would have detected early. > > Thanks! Any other thoughts? > -Greg