Are there any extra directories in /var/lib/ceph or /var/lib/ceph/<fsid>
that appear to be for those OSDs on that host? When cephadm builds the
info it uses for "ceph orch ps" it's actually scraping those
directories. The output of "cephadm ls" on the host with the duplicates
could also potentially have some insights. (A rough sketch of those
checks is below, after the quoted message.)

On Thu, Sep 1, 2022 at 12:15 PM Satish Patel <satish.txt@xxxxxxxxx> wrote:

> Folks,
>
> I am playing with cephadm and life was good until I started upgrading
> from Octopus to Pacific. My upgrade process got stuck after upgrading
> the mgr, and in the logs I can now see the following errors:
>
> root@ceph1:~# ceph log last cephadm
> 2022-09-01T14:40:45.739804+0000 mgr.ceph2.hmbdla (mgr.265806) 8 :
> cephadm [INF] Deploying daemon grafana.ceph1 on ceph1
> 2022-09-01T14:40:56.115693+0000 mgr.ceph2.hmbdla (mgr.265806) 14 :
> cephadm [INF] Deploying daemon prometheus.ceph1 on ceph1
> 2022-09-01T14:41:11.856725+0000 mgr.ceph2.hmbdla (mgr.265806) 25 :
> cephadm [INF] Reconfiguring alertmanager.ceph1 (dependencies
> changed)...
> 2022-09-01T14:41:11.861535+0000 mgr.ceph2.hmbdla (mgr.265806) 26 :
> cephadm [INF] Reconfiguring daemon alertmanager.ceph1 on ceph1
> 2022-09-01T14:41:12.927852+0000 mgr.ceph2.hmbdla (mgr.265806) 27 :
> cephadm [INF] Reconfiguring grafana.ceph1 (dependencies changed)...
> 2022-09-01T14:41:12.940615+0000 mgr.ceph2.hmbdla (mgr.265806) 28 :
> cephadm [INF] Reconfiguring daemon grafana.ceph1 on ceph1
> 2022-09-01T14:41:14.056113+0000 mgr.ceph2.hmbdla (mgr.265806) 33 :
> cephadm [INF] Found duplicate OSDs: osd.2 in status running on ceph1,
> osd.2 in status running on ceph2
> 2022-09-01T14:41:14.056437+0000 mgr.ceph2.hmbdla (mgr.265806) 34 :
> cephadm [INF] Found duplicate OSDs: osd.5 in status running on ceph1,
> osd.5 in status running on ceph2
> 2022-09-01T14:41:14.056630+0000 mgr.ceph2.hmbdla (mgr.265806) 35 :
> cephadm [INF] Found duplicate OSDs: osd.3 in status running on ceph1,
> osd.3 in status running on ceph2
>
> Not sure where the duplicate names came from or how that happened. In the
> following output I can't see any duplication:
>
> root@ceph1:~# ceph osd tree
> ID  CLASS  WEIGHT   TYPE NAME       STATUS  REWEIGHT  PRI-AFF
> -1         0.97656  root default
> -3         0.48828      host ceph1
>  4    hdd  0.09769          osd.4       up   1.00000  1.00000
>  0    ssd  0.19530          osd.0       up   1.00000  1.00000
>  1    ssd  0.19530          osd.1       up   1.00000  1.00000
> -5         0.48828      host ceph2
>  5    hdd  0.09769          osd.5       up   1.00000  1.00000
>  2    ssd  0.19530          osd.2       up   1.00000  1.00000
>  3    ssd  0.19530          osd.3       up   1.00000  1.00000
>
> But at the same time I can see duplicate OSD entries on ceph1 and ceph2:
>
> root@ceph1:~# ceph orch ps
> NAME                 HOST   PORTS        STATUS         REFRESHED  AGE
> MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
> alertmanager.ceph1   ceph1  *:9093,9094  running (20s)  2s ago     20s
> 17.1M    -                 ba2b418f427c  856a4fe641f1
> alertmanager.ceph1   ceph2  *:9093,9094  running (20s)  3s ago     20s
> 17.1M    -                 ba2b418f427c  856a4fe641f1
> crash.ceph2          ceph1               running (12d)  2s ago     12d
> 10.0M    -        15.2.17  93146564743f  0a009254afb0
> crash.ceph2          ceph2               running (12d)  3s ago     12d
> 10.0M    -        15.2.17  93146564743f  0a009254afb0
> grafana.ceph1        ceph1  *:3000       running (18s)  2s ago     19s
> 47.9M    -        8.3.5    dad864ee21e9  7d7a70b8ab7f
> grafana.ceph1        ceph2  *:3000       running (18s)  3s ago     19s
> 47.9M    -        8.3.5    dad864ee21e9  7d7a70b8ab7f
> mgr.ceph2.hmbdla     ceph1               running (13h)  2s ago     12d
> 506M     -        16.2.10  0d668911f040  6274723c35f7
> mgr.ceph2.hmbdla     ceph2               running (13h)  3s ago     12d
> 506M     -        16.2.10  0d668911f040  6274723c35f7
> node-exporter.ceph2  ceph1               running (91m)  2s ago     12d
> 60.7M    -        0.18.1   e5a616e4b9cf  d0ba04bb977c
> node-exporter.ceph2  ceph2               running (91m)  3s ago     12d
> 60.7M    -        0.18.1   e5a616e4b9cf  d0ba04bb977c
> osd.2                ceph1               running (12h)  2s ago     12d
> 867M     4096M    15.2.17  93146564743f  e286fb1c6302
> osd.2                ceph2               running (12h)  3s ago     12d
> 867M     4096M    15.2.17  93146564743f  e286fb1c6302
> osd.3                ceph1               running (12h)  2s ago     12d
> 978M     4096M    15.2.17  93146564743f  d3ae5d9f694f
> osd.3                ceph2               running (12h)  3s ago     12d
> 978M     4096M    15.2.17  93146564743f  d3ae5d9f694f
> osd.5                ceph1               running (12h)  2s ago     8d
> 225M     4096M    15.2.17  93146564743f  405068fb474e
> osd.5                ceph2               running (12h)  3s ago     8d
> 225M     4096M    15.2.17  93146564743f  405068fb474e
> prometheus.ceph1     ceph1  *:9095       running (8s)   2s ago     8s
> 30.4M    -                 514e6a882f6e  9031dbe30cae
> prometheus.ceph1     ceph2  *:9095       running (8s)   3s ago     8s
> 30.4M    -                 514e6a882f6e  9031dbe30cae
>
> Is this a bug, or did I do something wrong? Is there any workaround to
> get out of this condition?
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx
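For reference, a rough sketch of the checks I mean, run on one of the
hosts that reports OSDs it shouldn't have (standard ceph/cephadm CLI;
<fsid> is your cluster fsid, which "ceph fsid" prints):

  # What the orchestrator currently reports for this host
  ceph orch ps ceph1

  # Daemon directories cephadm scrapes to build that listing.
  # An osd.N directory here for an OSD that really lives on the other
  # host would be suspect.
  ceph fsid
  ls -l /var/lib/ceph/<fsid>/
  ls -l /var/lib/ceph/            # legacy (pre-cephadm) layout, e.g. osd/ceph-2

  # cephadm's own view of what is deployed on this host
  cephadm ls

If an extra osd.N directory does turn up on the wrong host, that would
line up with the duplicate entries, since "ceph orch ps" is built from
exactly those directories.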