(I've done some light blinking management in the past, and it is,
indeed, complicated)
On 3/5/19 2:39 PM, Sebastian Wagner wrote:
As far as I can see, nothing is going to work.
The state stored in Ceph will probably not match with the reality:
* Users may reboot machines without telling Ceph.
If Ceph has a daemon running on that machine, we should be able to
detect the reboot, yes?
* Users will start from scratch with a new ceph cluster
This is a real issue, more below.
* Users will want to enable lights on disks not yet known to Ceph
I don't think this is an issue; see below.
* Issuing on/off commands will fail.
There's, unfortunately, not much we can do about this.
My proposal, based on past experience, is that we declare a set of disks
we "own" (presumably the set with OSDs on them), and actively manage the
light state of those disks. We keep a list of the light state for each
such disk, and whenever that list changes, we go through and actively
set the light state of all the disks we own to either on or off. The
light state isn't important in general; it only matters when a user
actually wants to see a disk lit (or unlit), so I think we can get away with
making sure the lights are correct when they change, possibly with a
button that resets all lights to the correct state, in case someone
changed lights behind our back.
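
A minimal sketch of the scheme described above, not an existing Ceph API. The class name `OwnedLights` and the `set_light` callback (which would actually toggle the LED and is assumed to raise `OSError` on failure) are hypothetical. Whenever the desired list changes, every owned disk is driven to its recorded state, and `resync()` doubles as the "reset all lights" button.

```python
from typing import Callable, Dict, Iterable, List


class OwnedLights:
    """Desired light state for the disks we "own" (hypothetical helper)."""

    def __init__(self, owned: Iterable[str],
                 set_light: Callable[[str, bool], None]) -> None:
        self.owned = set(owned)
        self.desired: Dict[str, bool] = {}  # devid -> light should be on
        self.set_light = set_light          # assumed to raise OSError on failure

    def request(self, devid: str, on: bool) -> List[str]:
        """Record a desired state; resync everything if the list changed."""
        if devid not in self.owned:
            raise KeyError("not an owned disk: %s" % devid)
        if self.desired.get(devid, False) == on:
            return []  # list unchanged, nothing to do
        self.desired[devid] = on
        return self.resync()

    def resync(self) -> List[str]:
        """Drive every owned disk to its recorded state; return failures."""
        failed = []
        for devid in sorted(self.owned):
            try:
                self.set_light(devid, self.desired.get(devid, False))
            except OSError:
                failed.append(devid)  # on/off commands can fail; report them
        return failed
```

Disks that were never requested default to off, so a light someone turned on behind our back is cleared on the next resync.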
Daniel
Especially as there is a high probability that devices are broken on day
one.
I don't see any place (neither in Ceph nor in the orchestrator) where we
can store the state of LEDs reliably. I'd suggest that we treat the list
of enabled lights simply as a rough hint.
Sebastian
On 28.02.19 at 18:07, Brett Niver wrote:
That seems different from reading the state of an LED; it is rather
tracking whether LEDs have been turned on or not, i.e. internal state.
It doesn't have to match the actual diode state, it just needs to be
controlled centrally: one point of truth. Physically reading an LED
isn't always reliable anyway.
On Thu, Feb 28, 2019 at 11:59 AM Sage Weil <sweil@xxxxxxxxxx> wrote:
On Thu, 28 Feb 2019, Brett Niver wrote:
Why do we care about state? At some level the code has reasons to
want the LED to be either on or off...
Mostly we don't need to care. I can think of a couple problem
scenarios, though:
- Someone out of band turns a light on. Then ceph turns on another light,
a human sees the first light, and pulls the wrong drive.
- What if the host is down, but you want the health warning to go away?
There needs to be some 'force' option that will proceed to forget the
light was ever on when we can't reach the host, but that relies on a human
operator promising that the host really is off and thus the light won't
come back on.
- We have some bug/race in our code that means we fail to turn off the
light before removing our notion that the light is on. Maybe an aborted
attempt to turn the light on has some slow request wandering through the
orchestrator queue of stuff to do and finally executes sometime after we
tell the system to turn the light back off?
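
One possible guard against the last scenario, sketched under assumptions: each request carries a per-device generation number, and whoever finally executes it drops any request that has since been superseded. The `LightRequests` class and the tuple layout are hypothetical and are not part of the orchestrator.

```python
from typing import Dict, Tuple

Request = Tuple[str, bool, int]  # (devid, on, generation)


class LightRequests:
    """Tags each light request so that stale ones can be discarded."""

    def __init__(self) -> None:
        self._generation: Dict[str, int] = {}

    def submit(self, devid: str, on: bool) -> Request:
        """Create a new request, superseding older ones for this device."""
        gen = self._generation.get(devid, 0) + 1
        self._generation[devid] = gen
        return (devid, on, gen)

    def is_current(self, request: Request) -> bool:
        """True only for the most recent request issued for the device."""
        devid, _on, gen = request
        return self._generation.get(devid) == gen
```

The executor would call `is_current()` just before touching the hardware, so a slow "on" that arrives after a later "off" is skipped. This narrows the window but does not close it: a request that is already executing can still complete late.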
sage
On Thu, Feb 28, 2019 at 9:11 AM Sage Weil <sweil@xxxxxxxxxx> wrote:
On Thu, 28 Feb 2019, Tim Serong wrote:
On 02/28/2019 09:50 AM, Travis Nielsen wrote:
On Wed, Feb 27, 2019 at 3:42 PM Sage Weil <sweil@xxxxxxxxxx> wrote:
On Wed, 27 Feb 2019, Travis Nielsen wrote:
Some questions and comments:
- What is the user interaction? Is he specifying an OSD ID for which
he wants to blink the light or what is $PATH? If $PATH is a device
name such as /dev/sdb we would need to translate the OSD ID to the
device.
Right now the module implements
ceph device {ident,fault}-light-{on,off} <devid>
although once this is all working we can also add commands that operate on
osd IDs.
Presumably the OSD commands will just be implemented directly inside
ceph-mgr (which can get OSD metadata to map IDs back to the relevant
hostnames and device paths)? Or is there anything special an individual
orchestrator might need to do for this case?
Right, it'll just be a slightly more complicated command in the blinky
module (or wherever we move this code to later).
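
A rough sketch of what that OSD-to-device translation could look like. It assumes the OSD metadata is available as a dict with a `hostname` field and a `device_ids` field of comma-separated `name=devid` pairs; that shape, and the function name `osd_devices`, are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def osd_devices(metadata: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (hostname, device path, devid) for each device of one OSD.

    Assumes metadata looks like:
      {"hostname": "node1", "device_ids": "sdb=VENDOR_MODEL_SERIAL"}
    """
    host = metadata["hostname"]
    devices = []
    for item in metadata.get("device_ids", "").split(","):
        if not item:
            continue  # empty field: the OSD reported no devices
        name, _, devid = item.partition("=")
        devices.append((host, "/dev/" + name, devid))
    return devices
```

An OSD-level command would then loop over the result and issue the existing per-device light command for each `devid`.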
- This feels like a "desired state" way of doing things since you want
a light on until you decide to turn it off. In this case, we could
create a CRD for desired state of device lights. CRDs are the way the
rook module should interact with the rook operator.
- Whenever the CRD changes, rook would update the lights. When
rook starts, it would also ensure the lights are set appropriately.
- If a CRD is created it could mean the light should turn on for
that device. If the CRD is deleted, the light should turn off. If
there were different blinking modes, there could be a setting in the
CRD to indicate such.
That works. I was just thinking that since the mgr is already maintaining
this set of desired-on lights we could keep the rook side of it simple.
Ah, I missed that the mgr already stored this state. So if we can't
detect the actual state of the lights, this means the mgr is only
keeping track of the desire to turn the light on or off? And this
would translate to a health warning if a light should be on.
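
To make the health-warning idea concrete, here is a small sketch that turns the sets of desired-on lights into a dict of health checks. The check codes and the dict layout are illustrative assumptions, not necessarily what the module reports.

```python
from typing import Dict, Iterable


def light_health_checks(ident_on: Iterable[str],
                        fault_on: Iterable[str]) -> Dict[str, dict]:
    """Build one warning per light type that has any light desired on."""
    checks = {}
    for code, kind, devs in (
        ("DEVICE_IDENT_ON", "ident", sorted(ident_on)),  # hypothetical code
        ("DEVICE_FAULT_ON", "fault", sorted(fault_on)),  # hypothetical code
    ):
        if devs:
            checks[code] = {
                "severity": "warning",
                "summary": "%d device(s) have the %s light turned on"
                           % (len(devs), kind),
                "detail": devs,
            }
    return checks
```

Because the warning is derived only from the stored desire, it clears as soon as the entry is removed, whether or not the physical LED actually went dark.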
- What does it take to detect the current state of the lights? Do we
run lsmcli on each node? If so, the discovery daemonset would make
sense to do this.
If rook took the additional step of detecting lights that are on (due to
external actors) that would make the whole thing a bit more robust, and be
a good reason to bother with the complexity of a CRD. I don't see
anything to get current status from the version I have on fedora 29,
though.
If we didn't use a CRD, the rook module could store the settings in a
configmap, then run a k8s job itself to turn the lights on or off.
However, I'd say the CRDs are the more natural approach.
If we can't detect the current state with current tools, I wonder if just
having the mgr module schedule a one-off command to run lsmcli is
simpler... does having rook store the state in a configmap or crd buy us
anything?
Right, if we can't detect the current state of the lights, rook can't
really manage the desired state and it may not make sense for rook to get
involved here. The mgr module could easily run a k8s job directly to
turn the light on or off and we wouldn't worry about managing desired
state.
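
For reference, the one-off command such a job would run could be assembled like this. The sketch assumes lsmcli's `local-disk-{ident,fault}-led-{on,off} --path` subcommands; the helper name `lsmcli_led_command` is made up, and how the job gets scheduled is left out.

```python
from typing import List


def lsmcli_led_command(path: str, kind: str, on: bool) -> List[str]:
    """Build the argv for toggling one LED on one local disk."""
    if kind not in ("ident", "fault"):
        raise ValueError("kind must be 'ident' or 'fault', got %r" % kind)
    action = "on" if on else "off"
    return ["lsmcli", "local-disk-%s-led-%s" % (kind, action), "--path", path]
```

Returning an argv list rather than a shell string means the result can be passed directly to `subprocess.run()` or used as a container command without quoting problems.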
I'd suggest the same is true for other orchestrators
(ansible/deepsea/ssh). If we can't detect the state, we shouldn't do
anything at the individual orchestrator level. (If we could detect
state, we'd just want to pass it up to ceph-mgr, rather than having each
individual module implement its own record of LED state)
Right.
sage