While converting a Luminous cluster from filestore to bluestore, we are running into a weird race condition on a fairly regular basis. We have a master script that writes upgrade scripts for each OSD server. The script for an OSD looks like this:

  ceph osd out 68
  while ! ceph osd safe-to-destroy 68 ; do sleep 10 ; done
  systemctl stop ceph-osd@68
  sleep 10
  systemctl kill ceph-osd@68
  sleep 10
  umount /var/lib/ceph/osd/ceph-68
  ceph osd destroy 68 --yes-i-really-mean-it
  ceph-volume lvm zap /dev/sda --destroy
  ceph-volume lvm create --bluestore --data /dev/sda --osd-id 68
  sleep 10
  while [ "`ceph health`" != "HEALTH_OK" ] ; do ceph health; sleep 10 ; done

(It's run with sh -e, so any error will cause an abort.)

The problem we run into is that in about 1 out of 10 runs, when the script gets to the "lvm zap" stage, it fails:

  --> Zapping: /dev/sda
  Running command: wipefs --all /dev/sda2
  Running command: dd if=/dev/zero of=/dev/sda2 bs=1M count=10
   stderr: 10+0 records in
  10+0 records out
  10485760 bytes (10 MB, 10 MiB) copied, 0.00667608 s, 1.6 GB/s
  --> Destroying partition since --destroy was used: /dev/sda2
  Running command: parted /dev/sda --script -- rm 2
  --> Unmounting /dev/sda1
  Running command: umount -v /dev/sda1
   stderr: umount: /var/lib/ceph/tmp/mnt.9k0GDx (/dev/sda1) unmounted
  Running command: wipefs --all /dev/sda1
   stderr: wipefs: error: /dev/sda1: probing initialization failed:
   stderr: Device or resource busy
  --> RuntimeError: command returned non-zero exit status: 1

And, lo and behold, it's right: /dev/sda1 has been remounted as /var/lib/ceph/osd/ceph-68. That's after the OSD has been stopped, killed, and destroyed; there *is no* osd.68. It happens after the filesystem has been unmounted twice (once by our explicit umount and once by "lvm zap"). The "lvm zap" unmount shown above reports the path /var/lib/ceph/tmp/mnt.9k0GDx rather than our /var/lib/ceph/osd/ceph-68 mountpoint, which suggests the remount is happening in the background somewhere even while the zap is running.

If we do the zap before the osd destroy, the same thing happens, but the (still-existing) OSD does not actually restart. So it's just the filesystem that won't stay unmounted long enough to destroy it, not the whole OSD.

What's causing this? How do we keep the filesystem from lurching out of the grave in mid-conversion like this?

This is on Debian Stretch with systemd, if that matters.

Thanks!
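
P.S. In case it helps narrow this down: the next time we can reproduce it, we plan to log block-device uevents and mount-table changes around the zap step to catch whatever is doing the remount. This is only a rough sketch; our suspicion that it's a udev-triggered ceph-disk activation is just a guess at this point, and the log file names are arbitrary.

  # Sketch only: record uevents and mount-table changes while the zap runs,
  # then look for ceph-disk activity afterwards.
  udevadm monitor --udev --subsystem-match=block > /tmp/sda-uevents.log 2>&1 &
  UDEV_PID=$!
  findmnt --poll > /tmp/mount-changes.log 2>&1 &
  MNT_PID=$!

  ceph-volume lvm zap /dev/sda --destroy || true

  kill $UDEV_PID $MNT_PID
  grep sda1 /tmp/mount-changes.log
  # Check whether a ceph-disk activation unit ran at the same time (guesswork):
  systemctl list-units --all 'ceph-disk@*'
  journalctl -u 'ceph-disk@*' --since "10 min ago"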
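
P.P.S. The blunt workaround we're considering in the meantime is to keep re-unmounting /dev/sda1 and retrying the zap until it sticks. Sketch below; it obviously papers over the symptom rather than stopping whatever keeps remounting the partition, and the retry counts and sleeps are arbitrary.

  # Sketch only: retry the zap, first unmounting anything that has grabbed
  # /dev/sda1 again in the meantime.
  for try in 1 2 3 4 5 ; do
      while mnt=$(findmnt -n -o TARGET --source /dev/sda1 | head -n 1) && [ -n "$mnt" ] ; do
          umount "$mnt"
      done
      if ceph-volume lvm zap /dev/sda --destroy ; then
          break
      fi
      sleep 10
  done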