Hi,
Hmm, could you try dumping the crush map, decompiling it, editing it to remove the DNE OSDs, then compiling it and loading it back into Ceph?
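Something along these lines should do it (just a sketch, with placeholder file names):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt
  # edit crushmap.txt by hand and remove the DNE osd entries
  crushtool -c crushmap.txt -o crushmap.new
  ceph osd setcrushmap -i crushmap.new

If possible, sanity-check the recompiled map first (e.g. crushtool -i crushmap.new --test --show-statistics) before injecting it.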
Thanks
On Thu, Dec 29, 2016 at 1:01 PM, Łukasz Chrustek <skidoo@xxxxxxx> wrote:
Hi,
]# ceph osd tree
ID        WEIGHT   TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
-7        16.89590 root ssd-disks
-11              0     host ssd1
598798032        0         osd.598798032     DNE        0
21940            0         osd.21940         DNE        0
71               0         osd.71            DNE        0
]# ceph osd rm osd.598798032
Error EINVAL: osd id 598798032 is too large; invalid osd id (-34)
]# ceph osd rm osd.21940
osd.21940 does not exist.
]# ceph osd rm osd.71
osd.71 does not exist.
--
> ceph osd rm osd.$ID
> On Thu, Dec 29, 2016 at 10:44 AM, Łukasz Chrustek <skidoo@xxxxxxx> wrote:
> Hi,
> I was trying to delete 3 OSDs from the cluster; the deletion process took a very
> long time and I interrupted it. The mon process then crashed, and in ceph
> osd tree (after restarting ceph-mon) I saw:
> ~]# ceph osd tree
> ID         WEIGHT   TYPE NAME         UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -7         16.89590 root ssd-disks
> -11               0     host ssd1
> -231707408        0
> 22100             0         osd.22100     DNE        0
> 71                0         osd.71        DNE        0
> when I tried to delete osd.22100:
> [root@cc1 ~]# ceph osd crush remove osd.22100
> device 'osd.22100' does not appear in the crush map
> then I tried to delete osd.71 and the mon process crashed:
> [root@cc1 ~]# ceph osd crush remove osd.71
> 2016-12-28 17:52:34.459668 7f426a862700 0 monclient: hunting for new mon
> after restarting ceph-mon, ceph osd tree shows:
> # ceph osd tree
> ID        WEIGHT   TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
> -7        16.89590 root ssd-disks
> -11              0     host ssd1
> 598798032        0         osd.598798032     DNE        0
> 21940            0         osd.21940         DNE        0
> 71               0         osd.71            DNE        0
> My question is: how can I delete these OSDs without editing the crushmap directly?
> It is a production system and I can't afford any service interruption :(,
> and when I try ceph osd crush remove, ceph-mon crashes....
> I dumped the crushmap, but it took 19G (!!) after decompiling (the compiled
> file is very small). So I cleaned this file with perl (it took a very
> long time), and I now have a small text crushmap, which I have edited. But is
> there any chance that ceph will still remember these huge OSD numbers
> somewhere? Is it safe to apply this cleaned crushmap to the
> cluster? The cluster works OK now, but there is over 23TB of production data
> which I can't lose. Please advise what to do.
> --
> Regards
> Luk
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Regards,
Łukasz Chrustek
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com