Re: osd removal problem

Hi,

As I wrote in my first mail - I have already done that, but I'm afraid
to load it back - is there any chance that something will go wrong?
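
What I did so far is roughly this (the file names below are just the placeholders I used):

]# ceph osd getcrushmap -o crushmap.bin          (dump the compiled crush map)
]# crushtool -d crushmap.bin -o crushmap.txt     (decompile it to editable text)
   (edited crushmap.txt to drop the DNE osd entries)
]# crushtool -c crushmap.txt -o crushmap.new     (recompile the edited map)

and the step I haven't dared to run yet is loading it back:

]# ceph osd setcrushmap -i crushmap.new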

Thanks for the answer!

> Hi,


> Hmm, could you try to dump the crush map, decompile it, modify
> it to remove the DNE OSDs, compile it and load it back into ceph?


> http://docs.ceph.com/docs/master/rados/operations/crush-map/#get-a-crush-map



> Thanks

> On Thu, Dec 29, 2016 at 1:01 PM, Łukasz Chrustek <skidoo@xxxxxxx> wrote:

> Hi,

>  ]# ceph osd tree
> ID        WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
>         -7  16.89590 root ssd-disks
>        -11         0     host ssd1
>  598798032         0         osd.598798032     DNE        0
>      21940         0         osd.21940         DNE        0
>         71         0         osd.71            DNE        0

> ]# ceph osd rm osd.598798032
>  Error EINVAL: osd id 598798032 is too largeinvalid osd id-34
>  ]# ceph osd rm osd.21940
>  osd.21940 does not exist.
>  ]# ceph osd rm osd.71
>  osd.71 does not exist.


 >> ceph osd rm osd.$ID

 >> On Thu, Dec 29, 2016 at 10:44 AM, Łukasz Chrustek <skidoo@xxxxxxx> wrote:

 >> Hi,

 >>  I was trying to delete 3 OSDs from the cluster; the deletion process took a
 >>  very long time and I interrupted it. The mon process then crashed, and in
 >>  ceph osd tree (after restarting ceph-mon) I saw:

 >>   ~]# ceph osd tree
 >>  ID         WEIGHT    TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
 >>          -7  16.89590 root ssd-disks
 >>         -11         0     host ssd1
 >>  -231707408         0
 >>       22100         0         osd.22100        DNE        0
 >>          71         0         osd.71           DNE        0


 >>  when I tried to delete osd.22100:

 >>  [root@cc1 ~]# ceph osd crush remove osd.22100
 >>  device 'osd.22100' does not appear in the crush map

 >>  then I tried to delete osd.71 and the mon process crashed:

 >>  [root@cc1 ~]# ceph osd crush remove osd.71
 >>  2016-12-28 17:52:34.459668 7f426a862700  0 monclient: hunting for new mon

 >>  after restarting ceph-mon, ceph osd tree shows:

 >>  # ceph osd tree
 >>  ID        WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
 >>         -7  16.89590 root ssd-disks
 >>        -11         0     host ssd1
 >>  598798032         0         osd.598798032     DNE        0
 >>      21940         0         osd.21940         DNE        0
 >>         71         0         osd.71            DNE        0

 >>  My question is how to delete these OSDs without directly editing the
 >>  crushmap? It is a production system, I can't afford any service
 >>  interruption :(, and when I try ceph osd crush remove, ceph-mon crashes....

 >>  I dumped the crushmap, but it was 19G (!!) after decompiling (the
 >>  compiled file is very small). So I cleaned this file with perl (it took a
 >>  very long time), and I now have a small text crushmap, which I edited.
 >>  But is there any chance that ceph will still remember these huge osd
 >>  numbers somewhere? Is it safe to apply this cleaned crushmap to the
 >>  cluster? The cluster works OK now, but there is over 23TB of production
 >>  data which I can't lose. Please advise what to do.
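
 >>  Before applying it, I was planning to at least sanity-check the edited
 >>  map with crushtool, roughly like this (file names are just placeholders):

 >>  ]# crushtool -c crushmap-clean.txt -o crushmap-clean.bin    (recompile the cleaned text map)
 >>  ]# crushtool -i crushmap-clean.bin --tree                   (compare the tree with ceph osd tree)
 >>  ]# crushtool -i crushmap-clean.bin --test --show-statistics (dry-run the rules, check the mappings)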


 >>  --
 >>  Regards
 >>  Luk

 >>  _______________________________________________
 >>  ceph-users mailing list
 >> ceph-users@xxxxxxxxxxxxxx
 >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





> --
>  Regards,
>   Łukasz Chrustek






-- 
Regards,
 Łukasz Chrustek

_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



