On Tue, Oct 23, 2018 at 11:08:57PM +0000, Sage Weil wrote:
I gave the latest lsmcli (libstoragemgmt) another try and it can blink the
HDD lights on my generic 2U Supermicro boxes! It was a bit of a hassle
because Ubuntu has an ancient version packaged, but once I built from
source it can do 'ident' (blinky red light) or 'fault' (solid red light).
Pretty simple! And now is the time to harass the Ubuntu/Debian folks to
get this into the next round of releases so we can take advantage of it.
(Fedora/RHEL/CentOS should already have a good version.)
With the new device tracking that's coming in nautilus, I think we have
most of the pieces to surface useful ceph controls to turn lights on and
off. For example,
$ ceph device ls
DEVICE                               HOST:DEV   DAEMONS  LIFE EXPECTANCY
Crucial_CT1024M550SSD1_14160C164100  stud:sdd   osd.40   >6w
Crucial_CT1024M550SSD1_14210C25B79E  eutow:sds  osd.19   >6w
So we could add
$ ceph device ident-on Crucial_CT1024M550SSD1_14160C164100
$ ceph device fault-on Crucial_CT1024M550SSD1_14210C25B79E
...
$ ceph device ident-off Crucial_CT1024M550SSD1_14160C164100
$ ceph device fault-off Crucial_CT1024M550SSD1_14210C25B79E
or perhaps
$ ceph osd ident-on osd.123
$ ceph osd fault-on osd.124
I'd prefer this. Maybe by default only the data device, with a flag to
optionally blink the shared journal/db device?
(although note that OSDs may be backed by multiple devices, and you probably
don't want to pull the shared db/journal device in most cases).
My current thinking is that Ceph persistently stores which lights should be
on and raises a HEALTH_WARN (or HEALTH_INFO, nudge nudge) alert so that
the operator knows that the light(s) are (still) on.
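
To make the health alert idea concrete, here is a minimal Python sketch that
turns the stored "which lights should be on" record into a warning. The check
name DEVICE_LIGHT_ON, the function name, and the dict layout are all
hypothetical, chosen for illustration; nothing in this thread fixes them.

```python
def light_health_check(lights_on):
    """Build a health warning from the persisted light record.

    lights_on maps a device id to the set of lights ('ident', 'fault')
    that should currently be on. Name and layout are hypothetical.
    """
    # Ignore devices whose set of lights is empty.
    active = {dev: kinds for dev, kinds in lights_on.items() if kinds}
    if not active:
        return {}
    detail = [
        f"{dev}: {', '.join(sorted(kinds))} light on"
        for dev, kinds in sorted(active.items())
    ]
    return {
        "DEVICE_LIGHT_ON": {  # hypothetical check name
            "severity": "warning",
            "summary": f"{len(active)} device(s) have enclosure lights on",
            "detail": detail,
        }
    }
```

An empty record produces no check at all, so the warning clears by itself once
every light has been turned off.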
How to run lsmcli
-----------------
We can pretty trivially invoke 'lsmcli local-disk-fault-led-off --path
whatever' (or do something more minimal using the python bindings). The
gotcha is that we have to have something running on that host in order to
do it.
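
A rough Python sketch of invoking lsmcli on the host itself. Only
'local-disk-fault-led-off --path' is quoted above; the other three subcommand
names are assumed to follow the same pattern and should be checked against
'lsmcli --help' for the installed version. The function names are made up for
this example.

```python
import subprocess

# (light kind, desired state) -> lsmcli subcommand.
# Only 'local-disk-fault-led-off' appears in the text above; the other
# three names are assumed by analogy.
LED_SUBCOMMANDS = {
    ("ident", True): "local-disk-ident-led-on",
    ("ident", False): "local-disk-ident-led-off",
    ("fault", True): "local-disk-fault-led-on",
    ("fault", False): "local-disk-fault-led-off",
}


def build_led_command(kind, on, dev_path):
    """Return the argv list for switching one light on one local disk."""
    try:
        subcommand = LED_SUBCOMMANDS[(kind, on)]
    except KeyError:
        raise ValueError(f"unknown light kind: {kind!r}") from None
    return ["lsmcli", subcommand, "--path", dev_path]


def set_led(kind, on, dev_path, runner=subprocess.run):
    """Run the command locally; True means it exited with status 0."""
    result = runner(build_led_command(kind, on, dev_path),
                    capture_output=True, text=True)
    return result.returncode == 0
```

The runner argument is only there so the call can be exercised without real
hardware; whatever ends up running on the host would use the default.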
So, it would be pretty easy for an osd to ident its device(s) when it is
up, but if it's not up, then... not so much.
A few options:
1) Only do the ident/fault from a running OSD. This is pretty limiting,
and also runs the danger of not being able to turn the light off (if the
OSD then goes down).
2) Trigger the lights from any OSD (or possibly other daemon) that happens
to be running on the same host. This probably covers most cases, but...
it's still a bit limited. What if no OSDs are up? What if there is only
one OSD on the host and it is down?
3) Delegate this to the new orchestrator. Kube can just run this command
wherever we want. Ansible presumably can too.
IMHO this is the way to go. DeepSea was actually about to start working on this,
so great timing :)
One other detail: while I'm sure libstoragemgmt is getting better with time, I'm
equally sure there will always be hardware that does not play along. We were
going to make the actual command configurable so users can drop in whatever they
need for this. Going the operator route, this might not be ceph's concern
anymore, just thought I'd mention it.
4) Depend on the libstoragemgmt network service. lsmcli is just one part
of the suite... there's also a REST API that lets you do stuff. There are
presumably certificates to configure and such to make it all work, though.
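
Whichever option runs the command, the configurable command suggested in the
reply under option 3 could be a plain template string. A small Python sketch;
the placeholder names {kind}, {action} and {path} and the alternative tool
path are invented for this example, and the default template assumes the
lsmcli subcommands all follow the 'local-disk-fault-led-off' pattern.

```python
import shlex

# Default template; the subcommand pattern is assumed from the single
# 'local-disk-fault-led-off' example in this thread.
DEFAULT_TEMPLATE = "lsmcli local-disk-{kind}-led-{action} --path {path}"


def render_led_command(template, kind, action, path):
    """Expand a user-supplied template into an argv list.

    The template is split into words first and substituted afterwards,
    so a device path containing spaces stays a single argument.
    """
    return [
        word.format(kind=kind, action=action, path=path)
        for word in shlex.split(template)
    ]


# Default tool:
#   render_led_command(DEFAULT_TEMPLATE, "fault", "off", "/dev/sdd")
# Hypothetical site-specific replacement for hardware lsmcli cannot drive:
#   render_led_command("/usr/local/bin/my-led-tool {action} {kind} {path}",
#                      "ident", "on", "/dev/sdd")
```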
Also, there are some implementation oddities. The source of truth for the
on/off state is the enclosure itself. So if you turn the light off in ceph,
we need to be certain it was actually turned off on the device before we clear out
our state. Maybe we have states like off, pending-on, on, pending-off,
and we don't transition from pending-foo to foo until we get a success
from the command that is supposed to toggle the light state.
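
The four states could be sketched in Python roughly like this. The class and
method names are made up, and 'toggle' stands in for whatever ends up running
the command on the host; it is assumed to return True only when that command
reported success.

```python
from enum import Enum


class LightState(Enum):
    OFF = "off"
    PENDING_ON = "pending-on"
    ON = "on"
    PENDING_OFF = "pending-off"


class DeviceLight:
    """Tracks one light; leaves a pending state only after a success."""

    def __init__(self, toggle):
        self.state = LightState.OFF
        self._toggle = toggle  # callable(on: bool) -> bool, hypothetical

    def request(self, on):
        """Record the operator's intent first, then try to apply it."""
        self.state = LightState.PENDING_ON if on else LightState.PENDING_OFF
        return self.retry()

    def retry(self):
        """(Re)run the command for a pending state; no-op otherwise."""
        if self.state is LightState.PENDING_ON:
            if self._toggle(True):
                self.state = LightState.ON
        elif self.state is LightState.PENDING_OFF:
            if self._toggle(False):
                self.state = LightState.OFF
        return self.state
```

If the command fails, the light stays in pending-on or pending-off, so the
stored state never claims something the enclosure has not confirmed, and
retry() can be called again later (for example once a daemon on that host is
back up).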
Thoughts? I think this is within striking distance (finally) and it would
be sweet to land it in nautilus!
sage
--
Jan Fajerski
Engineer Enterprise Storage
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton,
HRB 21284 (AG Nürnberg)