So I ran a job on smithi against the teuthology code that is supposed to produce "No space left on device",
and it passed without hitting this issue. Which exact suite reproduces the issue?
Kyrylo Shatskyy -- SUSE Software Solutions Germany GmbH Maxfeldstr. 5 90409 Nuremberg Germany
On Wed, Oct 16, 2019 at 2:39 PM kyr <kshatskyy@xxxxxxx> wrote:
I hope Nathan's fix can do the thing; however, it does not cover the log referenced in the description of https://tracker.ceph.com/issues/42313, because the teuthology worker does not include that fix which is supposed to be the cause of the "No space left on device" issue.
I'm not quite sure what you mean here. I think one of these addresses your statement?
1) we were creating very small OSDs on the root device since the partitions weren't being mounted, and so these jobs actually filled them up as a consequence of that.
2) most of the teuthology repo is pulled fresh from master on every run. The workers themselves require restarting to get updates but that's pretty rare. (See https://github.com/ceph/teuthology/blob/master/teuthology/worker.py#L82)
Can someone give a one-job teuthology-suite command that reproduces the issue 100% of the time?
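On point 1 in the reply above (OSDs ending up on the root device because the partitions were not mounted), a quick way to check a test node by hand might look like the sketch below. It is not taken from teuthology; the helper names are hypothetical, and it only uses the Python standard library to report whether a directory is a mount point, how much space is free, and whether it shares a device with "/".

```python
import os
import shutil


def describe_path(path):
    """Return (is_mount_point, free_bytes) for the given path."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return os.path.ismount(path), usage.free


def shares_root_device(path):
    """True if path lives on the same filesystem as "/"."""
    # Equal st_dev values mean both paths sit on the same device,
    # i.e. nothing separate is mounted at (or above) the path.
    return os.stat(path).st_dev == os.stat("/").st_dev
```

If `shares_root_device()` returns True for a directory that was expected to be a dedicated OSD partition, writes there consume root filesystem space, which would match the "No space left on device" symptom described above.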
Kyrylo Shatskyy -- SUSE Software Solutions Germany GmbH Maxfeldstr. 5 90409 Nuremberg Germany
On Oct 16, 2019, at 11:14 PM, Nathan Cutler <ncutler@xxxxxxxx> wrote:
On Wed, Oct 16, 2019 at 12:43:32PM -0700, Gregory Farnum wrote:
On Wed, Oct 16, 2019 at 12:24 PM David Galloway <dgallowa@xxxxxxxxxx> wrote:
Yuri just reminded me that he's seeing this problem on the mimic branch.
Does that mean this PR just needs to be backported to all branches?
https://github.com/ceph/ceph/pull/30792
I'd be surprised if that one (changing iteritems() to items()) could cause this, and it's not a fix for any known bugs, just ongoing py3 work.
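For context on why the iteritems() to items() change is unlikely to alter behaviour: `dict.items()` exists on both Python 2 and Python 3, while `dict.iteritems()` was removed in Python 3. A minimal sketch, with a made-up dictionary that is not taken from the PR:

```python
# Hypothetical device map, invented for illustration only.
devs = {"/dev/sdb": "wwn-1", "/dev/sdc": "wwn-2"}


def pairs(d):
    # items() yields the same key/value pairs that iteritems() did;
    # on Python 3 it returns a view instead of a list.
    return sorted(d.items())


result = pairs(devs)
# On Python 3 a dict has no iteritems attribute at all.
has_iteritems = hasattr(devs, "iteritems")
```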
When I said "that commit" I was referring to https://github.com/ceph/teuthology/commit/41a13eca480e38cfeeba7a180b4516b90598c39b, which is in the teuthology repo and thus hits every test run. Looking at the comments across https://github.com/ceph/teuthology/pull/1332 and https://tracker.ceph.com/issues/42313 it sounds like that teuthology commit accidentally fixed a bug which triggered another bug that we're not sure how to resolve, but perhaps I'm misunderstanding?
I think I understand what's going on. Here's an interim fix: https://github.com/ceph/teuthology/pull/1334
Assuming this PR really does fix the issue, the "real" fix will be to drop get_wwn_id_map altogether, since it has long outlived its usefulness ( see https://tracker.ceph.com/issues/14855 ).
Nathan
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx