Re: ceph-osd@ service keeps restarting after removing osd

On Thu, May 31, 2018 at 4:40 PM Gregory Farnum <gfarnum@xxxxxxxxxx> wrote:
On Thu, May 24, 2018 at 9:15 AM Michael Burk <michael.burk@xxxxxxxxxxx> wrote:
Hello,

I'm trying to replace my OSDs with higher-capacity drives. I went through the steps to remove the OSD on the OSD node:
# ceph osd out osd.2
# ceph osd down osd.2
# ceph osd rm osd.2
Error EBUSY: osd.2 is still up; must be down before removal.
# systemctl stop ceph-osd@2
# ceph osd rm osd.2
removed osd.2
# ceph osd crush rm osd.2
removed item id 2 name 'osd.2' from crush map
# ceph auth del osd.2
updated

# umount /var/lib/ceph/osd/ceph-2
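(Aside: on luminous, the rm / crush rm / auth del steps above can be collapsed into a single purge command, which should be roughly equivalent:)

# ceph osd purge osd.2 --yes-i-really-mean-it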

It no longer shows up in the CRUSH map, and I am ready to remove the drive. However, the ceph-osd@ service keeps restarting and mounting the disk under /var/lib/ceph/osd. I run "systemctl stop ceph-osd@2" and umount the disk, but then the service starts again and mounts the drive.

# systemctl stop ceph-osd@2
# umount /var/lib/ceph/osd/ceph-2

/dev/sdb1 on /var/lib/ceph/osd/ceph-2 type xfs (rw,noatime,seclabel,attr2,inode64,logbsize=256k,sunit=512,swidth=512,noquota)

ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous (stable)

What am I missing?

Obviously this is undesired!
In general, when using ceph-disk (as you presumably are), the OSD is designed to start automatically when a formatted disk gets mounted. I'd imagine that something (quite possibly shipped with ceph) is auto-mounting the disk after you unmount it. We have a ceph-disk@.service which is supposed to fire only once, but perhaps there's something else I'm missing: udev fires an event, it gets captured by one of the ceph tools, which sees an available drive tagged for Ceph and auto-mounts it? I'm not sure why this would be happening for you and not others, though.
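(For reference, the activation path with ceph-disk: the package ships a udev rule — typically /usr/lib/udev/rules.d/95-ceph-osd.rules — that matches the Ceph OSD data partition-type GUID and runs "ceph-disk trigger" on the device, which mounts it and starts ceph-osd@N. Exact paths and match strings vary by distro and release, but the rule looks roughly like:)

ACTION=="add", SUBSYSTEM=="block", \
  ENV{DEVTYPE}=="partition", \
  ENV{ID_PART_ENTRY_TYPE}=="4fbd7e29-9d25-41b8-afd0-062c0ceff05d", \
  RUN+="/usr/sbin/ceph-disk --log-stdout -v trigger /dev/$name"

(So any block "add" event on that partition — a rescan, a partprobe, another tool re-reading the partition table — is enough to re-activate the OSD.)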
I'm guessing it's because I'm replacing a batch of disks at once. The time between stopping ceph-osd@ and seeing it start again is at least several seconds, so if you just do one disk and remove it right away you probably wouldn't have this problem. But since I do several at a time and watch "ceph -w" after each one, it can be several minutes before I get to the point of removing the volumes from the array controller.
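(One way to confirm what fires during that window — assuming the trigger is udev-driven — is to watch block events while stopping the unit:)

# udevadm monitor --udev --subsystem-match=block
(then, in a second terminal)
# systemctl stop ceph-osd@2 && umount /var/lib/ceph/osd/ceph-2

(If block events for the partition appear and the mount comes back, a udev rule re-running ceph-disk is the likely culprit; "journalctl -u ceph-osd@2" will also show what started the unit each time.)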

All this changes with ceph-volume, which will be the default in Mimic, by the way.

Hmm, just poking at things a little more, I think maybe you wanted to put a "ceph-disk deactivate" invocation in there. Try playing around with that?
Ahh, good catch. I will test this. Thank you!
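(For the record, the invocation would be something along these lines — untested here, and flags vary a bit between releases:)

# ceph-disk deactivate --mark-out /dev/sdb1
# ceph-disk destroy --destroy-by-id 2 --zap

(deactivate stops the daemon and unmounts the data directory, with --mark-out also marking the OSD out; destroy with --zap wipes the partition table, so udev no longer recognizes the disk as a Ceph OSD and stops re-triggering activation.)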

-Greg

 

Thanks,
Michael
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com