On Wed, Oct 16, 2019 at 5:39 PM kyr <kshatskyy@xxxxxxx> wrote:
>
> I hope Nathan's fix will probably do the trick; however, it does not cover
> the log referenced in the description of https://tracker.ceph.com/issues/42313,
> because the teuthology worker does not include that fix, which is supposed
> to be the cause of the "No space left on device" issue.

If teuthology has been picking the wrong devices for the past week (as
described by Nathan's PR), why wouldn't you expect that to impact OSDs
running out of space?

> Can someone give a one-job teuthology-suite command that reproduces the
> issue 100% of the time?

I believe that's the goal: to test Nathan's change against the RBD suite,
since it has been hitting this across all branches since at least the end
of last week. The logs prior to this breakage show ...

2019-10-02T21:37:45.198 INFO:tasks.ceph:fs option selected, checking for scratch devs
2019-10-02T21:37:45.199 INFO:tasks.ceph:found devs: ['/dev/vg_nvme/lv_4', '/dev/vg_nvme/lv_3', '/dev/vg_nvme/lv_2', '/dev/vg_nvme/lv_1']
2019-10-02T21:37:45.199 INFO:teuthology.orchestra.run.smithi197:Running:
2019-10-02T21:37:45.199 INFO:teuthology.orchestra.run.smithi197:> ls -l '/dev/disk/by-id/wwn-*'
2019-10-02T21:37:45.265 INFO:teuthology.orchestra.run.smithi197.stderr:ls: cannot access /dev/disk/by-id/wwn-*: No such file or directory
2019-10-02T21:37:45.265 DEBUG:teuthology.orchestra.run:got remote process result: 2
2019-10-02T21:37:45.266 INFO:teuthology.misc:Failed to get wwn devices! Using /dev/sd* devices...

... and after ...

2019-10-10T19:42:32.759 INFO:tasks.ceph:fs option selected, checking for scratch devs
2019-10-10T19:42:32.759 INFO:tasks.ceph:found devs: ['/dev/vg_nvme/lv_4', '/dev/vg_nvme/lv_3', '/dev/vg_nvme/lv_2', '/dev/vg_nvme/lv_1']
2019-10-10T19:42:32.759 INFO:teuthology.orchestra.run.smithi177:Running:
2019-10-10T19:42:32.759 INFO:teuthology.orchestra.run.smithi177:> ls -l /dev/disk/by-id/wwn-*
2019-10-10T19:42:32.813 INFO:teuthology.orchestra.run.smithi177.stdout:lrwxrwxrwx. 1 root root 9 Oct 10 19:33 /dev/disk/by-id/wwn-0x5000c5009294113e -> ../../sda
2019-10-10T19:42:32.813 INFO:teuthology.orchestra.run.smithi177.stdout:lrwxrwxrwx. 1 root root 10 Oct 10 19:33 /dev/disk/by-id/wwn-0x5000c5009294113e-part1 -> ../../sda1
2019-10-10T19:42:32.813 INFO:tasks.ceph:dev map: {}

... so it seems like a good candidate fix.
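For anyone following the excerpts, here is a rough, hypothetical sketch of the
dev-map step (the real mapping is done by teuthology's get_wwn_id_map in
teuthology.misc and is more involved; build_dev_map below is made up purely
for illustration) showing why the second run ends with "dev map: {}": the
wwn-* symlinks only describe the bare /dev/sd* devices, so the LVM scratch
devices requested by the job never match and the map comes back empty.

def build_dev_map(ls_output, scratch_devs):
    """Map each requested scratch dev to its wwn-* alias, if one exists."""
    wwn_by_dev = {}
    for line in ls_output.splitlines():
        if ' -> ' not in line:
            continue
        alias, target = line.rsplit(' -> ', 1)
        alias = alias.split()[-1]                  # e.g. /dev/disk/by-id/wwn-0x5000c5...
        dev = '/dev/' + target.rsplit('/', 1)[-1]  # e.g. ../../sda -> /dev/sda
        wwn_by_dev[dev] = alias
    return {d: wwn_by_dev[d] for d in scratch_devs if d in wwn_by_dev}

ls_output = """\
lrwxrwxrwx. 1 root root  9 Oct 10 19:33 /dev/disk/by-id/wwn-0x5000c5009294113e -> ../../sda
lrwxrwxrwx. 1 root root 10 Oct 10 19:33 /dev/disk/by-id/wwn-0x5000c5009294113e-part1 -> ../../sda1"""

scratch_devs = ['/dev/vg_nvme/lv_4', '/dev/vg_nvme/lv_3', '/dev/vg_nvme/lv_2', '/dev/vg_nvme/lv_1']
print(build_dev_map(ls_output, scratch_devs))  # prints {} -- no LVM dev has a wwn-* alias

In the first excerpt the glob was quoted, so the ls failed and the code fell
back to the /dev/sd* devices; in the second the ls succeeds but the resulting
map is empty, which matches the "dev map: {}" line above.
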
> Kyrylo Shatskyy
> --
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5
> 90409 Nuremberg
> Germany
>
>
> On Oct 16, 2019, at 11:14 PM, Nathan Cutler <ncutler@xxxxxxxx> wrote:
>
> > On Wed, Oct 16, 2019 at 12:43:32PM -0700, Gregory Farnum wrote:
> > > On Wed, Oct 16, 2019 at 12:24 PM David Galloway <dgallowa@xxxxxxxxxx> wrote:
> > > > Yuri just reminded me that he's seeing this problem on the mimic branch.
> > > > Does that mean this PR just needs to be backported to all branches?
> > > > https://github.com/ceph/ceph/pull/30792
> > >
> > > I'd be surprised if that one (changing iteritems() to items()) could
> > > cause this, and it's not a fix for any known bugs, just ongoing py3 work.
> > >
> > > When I said "that commit" I was referring to
> > > https://github.com/ceph/teuthology/commit/41a13eca480e38cfeeba7a180b4516b90598c39b,
> > > which is in the teuthology repo and thus hits every test run. Looking at
> > > the comments across https://github.com/ceph/teuthology/pull/1332 and
> > > https://tracker.ceph.com/issues/42313 it sounds like that teuthology
> > > commit accidentally fixed a bug which triggered another bug that we're
> > > not sure how to resolve, but perhaps I'm misunderstanding?
> >
> > I think I understand what's going on. Here's an interim fix:
> > https://github.com/ceph/teuthology/pull/1334
> >
> > Assuming this PR really does fix the issue, the "real" fix will be to drop
> > get_wwn_id_map altogether, since it has long outlived its usefulness
> > (see https://tracker.ceph.com/issues/14855).
> >
> > Nathan

--
Jason
_______________________________________________
Dev mailing list -- dev@xxxxxxx
To unsubscribe send an email to dev-leave@xxxxxxx