On 15/03/2021 11:29, Matthew Vernon wrote:
On 15/03/2021 11:09, Dan van der Ster wrote:
Occasionally we see a bus glitch which causes a device to disappear
then reappear with a new /dev/sd name. This crashes the OSD (giving IO
errors) but after a reboot the OSD will be perfectly fine.
We're looking for a way to reactivate an OSD like this without rebooting.
Systemd's udev plumbing is _meant_ to cope with this OK (infuriatingly
the only place it seems to do so reliably is our test cluster!), but it
doesn't seem very good at it.
Sorry, I realise showing what that looks like when it works might be
helpful.
Pulling a drive (/dev/sdan):
Oct 1 15:55:49 sto-t1-3 systemd[1]: Stopping LVM2 PV scan on device 66:112...
Oct 1 15:55:49 sto-t1-3 lvm[932541]: Device 66:112 not found. Cleared from lvmetad cache.
Oct 1 15:55:49 sto-t1-3 systemd[1]: Stopped LVM2 PV scan on device 66:112.
then after the drive comes back (as /dev/sdbk):
Oct 1 15:57:04 sto-t1-3 systemd[1]: Starting LVM2 PV scan on device 67:224...
Oct 1 15:57:04 sto-t1-3 lvm[932557]: 1 logical volume(s) in volume group "ceph-5077d6e1-460b-43ca-8845-5cbae468c1a8" now active
Oct 1 15:57:04 sto-t1-3 systemd[1]: Started LVM2 PV scan on device 67:224.
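
For anyone matching these log lines to drives: the major:minor pairs in the PV scan messages correspond to the /dev/sd names mentioned above. A minimal sketch of that mapping follows, assuming the conventional Linux SCSI disk numbering (majors 8, 65-71 and 128-135, with 16 minors per disk). The function name sd_name is made up for illustration, and the sketch does not cover hosts with more than 256 sd devices.

```python
# Sketch only: map a major:minor pair to the conventional sd device name.
# Assumes the classic SCSI disk major blocks; sd_name is a hypothetical helper.
SD_MAJORS = [8] + list(range(65, 72)) + list(range(128, 136))


def sd_name(major: int, minor: int) -> str:
    """Return e.g. 'sdan' for 66:112, or 'sda1' for 8:1."""
    block = SD_MAJORS.index(major)        # raises ValueError for non-sd majors
    disk_index = block * 16 + minor // 16  # 16 disks per major, 16 minors per disk
    partition = minor % 16                 # 0 means the whole disk

    # Disk letters are bijective base 26: a..z, aa..az, ba..bz, ...
    letters = ""
    n = disk_index + 1
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("a") + rem) + letters

    name = "sd" + letters
    return name if partition == 0 else f"{name}{partition}"


if __name__ == "__main__":
    print(sd_name(66, 112))  # device that was pulled
    print(sd_name(67, 224))  # device that came back
```

With the numbers from the logs, 66:112 comes out as sdan and 67:224 as sdbk, which matches the drive being pulled as /dev/sdan and returning as /dev/sdbk.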
Regards,
Matthew
--
The Wellcome Sanger Institute is operated by Genome Research
Limited, a charity registered in England with number 1021457 and a
company registered in England with number 2742969, whose registered
office is 215 Euston Road, London, NW1 2BE.