Re: osd removal problem

Hi,

Hmm, could you try dumping the crush map, decompiling it, modifying it to remove the DNE OSDs, recompiling it, and loading it back into Ceph?

http://docs.ceph.com/docs/master/rados/operations/crush-map/#get-a-crush-map
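
A rough sketch of that workflow (the file names here are just placeholders, adapt them to your setup):

# dump the compiled crush map from the cluster
ceph osd getcrushmap -o crushmap.bin
# decompile it into an editable text file
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt to drop the DNE osd entries, then recompile it
crushtool -c crushmap.txt -o crushmap.new
# optionally sanity-check the new map's mappings before injecting it
crushtool -i crushmap.new --test --show-statistics
# load the edited map back into the cluster
ceph osd setcrushmap -i crushmap.new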

Thanks

On Thu, Dec 29, 2016 at 1:01 PM, Łukasz Chrustek <skidoo@xxxxxxx> wrote:
Hi,

]# ceph osd tree
ID        WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
       -7  16.89590 root ssd-disks
      -11         0     host ssd1
598798032         0         osd.598798032     DNE        0
    21940         0         osd.21940         DNE        0
       71         0         osd.71            DNE        0

]# ceph osd rm osd.598798032
Error EINVAL: osd id 598798032 is too largeinvalid osd id-34
]# ceph osd rm osd.21940
osd.21940 does not exist.
]# ceph osd rm osd.71
osd.71 does not exist.

> ceph osd rm osd.$ID

> On Thu, Dec 29, 2016 at 10:44 AM, Łukasz Chrustek <skidoo@xxxxxxx> wrote:

> Hi,

>  I was trying to delete 3 OSDs from the cluster; the deletion process took a
>  very long time and I interrupted it. The mon process then crashed, and in
>  ceph osd tree (after restarting ceph-mon) I saw:

>   ~]# ceph osd tree
>  ID         WEIGHT    TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
>          -7  16.89590 root ssd-disks
>         -11         0     host ssd1
>  -231707408         0
>       22100         0         osd.22100        DNE        0
>          71         0         osd.71           DNE        0


>  when I tried to delete osd.22100:

>  [root@cc1 ~]# ceph osd crush remove osd.22100
>  device 'osd.22100' does not appear in the crush map

>  then I tried to delete osd.71 and the mon process crashed:

>  [root@cc1 ~]# ceph osd crush remove osd.71
>  2016-12-28 17:52:34.459668 7f426a862700  0 monclient: hunting for new mon

>  after restarting ceph-mon, ceph osd tree shows:

>  # ceph osd tree
>  ID        WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
>         -7  16.89590 root ssd-disks
>        -11         0     host ssd1
>  598798032         0         osd.598798032     DNE        0
>      21940         0         osd.21940         DNE        0
>         71         0         osd.71            DNE        0

>  My question is: how can I delete these OSDs without directly editing the
>  crushmap? It is a production system and I can't afford any service
>  interruption :(, and when I try ceph osd crush remove, ceph-mon crashes....

>  I dumped the crushmap, but it took 19G (!!) after decompiling (the compiled
>  file is very small). So I cleaned this file with perl (it took a very long
>  time), and I now have a small text crushmap, which I edited. But is there
>  any chance that ceph will still remember these huge OSD numbers somewhere?
>  Is it safe to apply this cleaned crushmap to the cluster? The cluster works
>  OK now, but there is over 23TB of production data which I can't lose.
>  Please advise what to do.


>  --
>  Regards
>  Luk






--
Regards,
 Łukasz Chrustek


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
