Thanks, also for finding the related tracker issue! It looks like a fix has already been approved. I hope it shows up in the next release.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Eugen Block <eblock@xxxxxx>
Sent: 28 November 2022 10:58:31
To: Frank Schilder
Cc: ceph-users@xxxxxxx
Subject: Re: Re: ceph-volume lvm zap destroys up+in OSD

Hi,

it seems like this tracker issue [1] already covers your question. I'll update the issue and add a link to our thread.

[1] https://tracker.ceph.com/issues/57767

Quoting Frank Schilder <frans@xxxxxx>:

> Hi Eugen,
>
> can you confirm that the silent corruption also happens on a
> collocated OSD (everything on the same device) on Pacific? The zap
> command should simply exit with "osd not down+out", or at least not
> do anything.
>
> If this accidentally destructive behaviour is still present, I think
> it is worth a ticket. Since I can't test on versions higher than
> Octopus yet, could you open the ticket?
>
> Thanks!
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Eugen Block <eblock@xxxxxx>
> Sent: 23 November 2022 09:27:22
> To: ceph-users@xxxxxxx
> Subject: Re: ceph-volume lvm zap destroys up+in OSD
>
> Hi,
>
> I can confirm the behaviour for Pacific version 16.2.7. I checked with
> a Nautilus test cluster and there it seems to work as expected: I
> tried to zap a db device and then restarted one of the OSDs,
> successfully. So there seems to be a regression somewhere. I didn't
> search for tracker issues yet, but this seems to be worth one, right?
>
> Quoting Frank Schilder <frans@xxxxxx>:
>
>> Hi all,
>>
>> on our octopus-latest cluster I accidentally destroyed an up+in OSD
>> with the command line
>>
>> ceph-volume lvm zap /dev/DEV
>>
>> It executed the dd command and then failed at the lvm commands with
>> "device busy". Problem number one is that the OSD continued working
>> fine, so there is no indication of any corruption; it is a silent
>> corruption. Problem number two - the real one - is: why does
>> ceph-volume not check whether the OSD that the device belongs to is
>> still up+in? "ceph osd destroy" does that, for example. I believe I
>> remember that "ceph-volume lvm zap --osd-id" also checks, but I'm
>> not sure.
>>
>> Has this been changed in versions later than Octopus?
>>
>> I think it is extremely dangerous to provide a tool that allows the
>> silent corruption of an entire ceph cluster. The corruption is only
>> discovered on restart, and then it is too late (unless there is an
>> unofficial recovery procedure somewhere).
>>
>> I would prefer that ceph-volume lvm zap apply the same strict
>> sanity checks as other ceph commands to avoid accidents. In my case
>> it was a typo, one wrong letter.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
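
For anyone who needs to zap devices before a release containing the fix is available, the down+out check Frank asks for can be approximated with a small wrapper around ceph-volume. The sketch below is an illustration only and not part of ceph-volume: the script name safe-zap.sh is made up, it assumes root access, a working admin ceph CLI and jq on the OSD node, and it assumes that "ceph-volume lvm list --format json <device>" keys its JSON output by OSD id (as the unfiltered listing does).

    #!/bin/bash
    # safe-zap.sh -- hypothetical wrapper, illustration only (not part of ceph).
    # Refuses to run "ceph-volume lvm zap" on a device that still backs an OSD
    # which is up or in according to the OSD map.
    # Usage: ./safe-zap.sh /dev/DEV
    set -euo pipefail
    dev="$1"

    # OSD ids with data/db/wal LVs on this device; empty if the device is unused.
    osd_ids=$(ceph-volume lvm list --format json "$dev" 2>/dev/null | jq -r 'keys[]' || true)

    # One snapshot of the OSD map for all checks.
    osd_map=$(ceph osd dump --format json)

    for id in $osd_ids; do
        osd_up=$(echo "$osd_map" | jq ".osds[] | select(.osd == $id) | .up")
        osd_in=$(echo "$osd_map" | jq ".osds[] | select(.osd == $id) | .in")
        # Refuse unless the OSD is clearly down and out (errs on the side of refusing).
        if [ "$osd_up" != "0" ] || [ "$osd_in" != "0" ]; then
            echo "osd.$id on $dev is not down+out, refusing to zap" >&2
            exit 1
        fi
    done

    exec ceph-volume lvm zap "$dev"

This only guards the plain device form of the command and does not replace the fix tracked in the issue above; the OSD still has to be stopped and marked out before zapping.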