Re: problem with removing osd

On Thu, 29 Dec 2016, Łukasz Chrustek wrote:
> Hi,
> 
> > On Thu, 29 Dec 2016, Łukasz Chrustek wrote:
> >> Hi,
> >> 
> >> 
> >> >> 
> >> >> # ceph osd tree
> >> >> ID        WEIGHT    TYPE NAME             UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >> >>        -7  16.89590 root ssd-disks
> >> >>       -11         0     host ssd1
> >> >> 598798032         0         osd.598798032     DNE        0
> >> 
> >> > Yikes!
> >> 
> >> Yes... indeed, I don't like this number either...
> >> 
> >> >>     21940         0         osd.21940         DNE        0
> >> >>        71         0         osd.71            DNE        0
> >> >> 
> >> >> My question is how to delete these OSDs without directly editing the
> >> >> crushmap? It is a production system, I can't afford any service
> >> >> interruption :(, and when I try 'ceph osd crush remove' the ceph-mon
> >> >> crashes....
> >> >> 
> >> >> I dumped the crushmap, but it took 19G (!!) after decompiling (the
> >> >> compiled file is very small). So I cleaned this file with perl (it took
> >> >> a very long time), and I now have a small txt crushmap, which I edited.
> >> >> But is there any chance that ceph will still remember these huge OSD
> >> >> numbers somewhere? Is it safe to apply this cleaned crushmap to the
> >> >> cluster?
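> >> >>
> >> >> (For reference, what I did was roughly the standard crushtool
> >> >> round-trip; the file names below are just examples:
> >> >>
> >> >> # fetch the compiled crushmap from the cluster
> >> >> ceph osd getcrushmap -o cm.bin
> >> >> # decompile it to editable text
> >> >> crushtool -d cm.bin -o cm.txt
> >> >> # ... clean the bogus osd entries out of cm.txt ...
> >> >> # recompile, and only once it looks sane, inject it back
> >> >> crushtool -c cm.txt -o cm-new.bin
> >> >> ceph osd setcrushmap -i cm-new.bin
> >> >>
> >> >> I have only done the edit so far, not the setcrushmap.)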
> >> 
> >> > It sounds like the problem is the OSDMap, not CRUSH per se.  Can you 
> >> > attach the output from 'ceph osd dump -f json-pretty'?
> >> 
> >> It's quite big so I put it on pastebin:
> >> 
> >> http://pastebin.com/Unkk2Pa7
> >> 
> >> > Do you know how osd.598798032 got created?  Or osd.21940 for that matter.
> >> > OSD ids should be small since they are stored internally by OSDMap as a
> >> > vector.  This is probably why your mon is crashing.
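> >> >
> >> > A quick way to confirm (assuming the mon will answer at all) is to check
> >> > the map's max_osd, which normally tracks the largest id ever allocated:
> >> >
> >> >   ceph osd getmaxosd
> >> >
> >> > If that comes back as something like 598798033, the OSDMap's internal
> >> > per-osd vectors are being sized to match, which is consistent with the
> >> > mon blowing up.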
> >> 
> >> [root@cc1 /etc/ceph]# ceph osd tree
> >> ID  WEIGHT    TYPE NAME            UP/DOWN REWEIGHT PRIMARY-AFFINITY
> >>  -7  16.89590 root ssd-intel-s3700
> >> -11         0     host ssd-stor1
> >>  69         0         osd.69          down        0          1.00000
> >>  70         0         osd.70          down        0          1.00000
> >>  71         0         osd.71          down        0          1.00000
> >> 
> >> 
> >> This is the moment when it happened:
> >> ]# for i in `seq 69 71`;do ceph osd crush remove osd.$i;done
> >> removed item id 69 name 'osd.69' from crush map
> >> 
> >> 
> >> removed item id 70 name 'osd.70' from crush map
> >> 
> >> here I pressed ctrl+c
> >> 
> >> 2016-12-28 17:38:10.055239 7f4576d7a700  0 monclient: hunting for new mon
> >> 2016-12-28 17:38:10.055582 7f4574233700  0 -- 192.168.128.1:0/1201679761 >> 192.168.128.2:6789/0 pipe(0x7f456c023190 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f456c024470).fault
> >> 2016-12-28 17:38:30.550622 7f4574233700  0 -- 192.168.128.1:0/1201679761 >> 192.168.128.1:6789/0 pipe(0x7f45600008c0 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7f4560001df0).fault
> >> 2016-12-28 17:38:54.551031 7f4574474700  0 -- 192.168.128.1:0/1201679761 >> 192.168.128.2:6789/0 pipe(0x7f45600046c0 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7f45600042b0).fault
> 
> > What version is this?
> 
> infernalis
> 
> > Can you attach the crush map too?  (ceph osd crush dump -f json-pretty)
> 
> I can't - the ceph-mons are crashing on different ceph-mon hosts:
> 
> ]# ceph osd crush dump -f json-pretty

Hmm, in that case, 'ceph osd getcrushmap -o cm' and post that somewhere?
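
If you want to look at it locally first, the decompile step doesn't need the 
mons at all once you have the file, so something like

  crushtool -d cm -o cm.txt

should work even with the mons in this state (assuming getcrushmap itself 
still succeeds).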

sage
