On 5/17/2021 3:01 PM, Stuart Hayes wrote:
On 5/5/2021 11:12 AM, Keith Busch wrote:
On Fri, Apr 16, 2021 at 03:20:10PM -0400, Stuart Hayes wrote:
This patch adds support for the PCIe SSD Status LED Management
interface, as described in the "_DSM Additions for PCIe SSD Status LED
Management" ECN to the PCI Firmware Specification revision 3.2.
It adds a single LED (a led_classdev) for any PCIe device that has the
relevant _DSM. The ten possible status states are exposed through the
current_states and supported_states attributes. Reading current_states
(or supported_states) shows the definition and value of each bit:
There is significant overlap in this ECN with the PCIe native enclosure
management (NPEM) capability. Would it be possible for the sysfs
interface to provide an abstraction that both of these implementations
could subscribe to?
It wouldn't be too hard to check for the NPEM capability here and
provide access to it as well (or it could be added later without too
much trouble), but it looks like NPEM support is already implemented in
user space (in ledmon). (I only wrote a kernel module for this because
it uses a _DSM, which can't readily be accessed from user space.)
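For reference, the kernel side of that check can be little more than an
acpi_check_dsm() call on the device's ACPI companion. A rough sketch
(the GUID, revision, and function index below are placeholders; the
real values come from the ECN):

#include <linux/acpi.h>
#include <linux/pci.h>

/* Placeholder GUID -- substitute the one defined in the ECN */
static const guid_t ssdled_dsm_guid =
        GUID_INIT(0x00000000, 0x0000, 0x0000,
                  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00);

#define SSDLED_DSM_REV          0x1     /* assumed revision */
#define SSDLED_DSM_GET_STATES   0x2     /* assumed function index */

static bool ssdled_dsm_supported(struct pci_dev *pdev)
{
        acpi_handle handle = ACPI_HANDLE(&pdev->dev);

        if (!handle)
                return false;

        /* ask firmware whether the _DSM function is implemented */
        return acpi_check_dsm(handle, &ssdled_dsm_guid, SSDLED_DSM_REV,
                              1 << SSDLED_DSM_GET_STATES);
}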
I've reworked the driver a bit, so it would be very easy to separate it
into two parts:
* a generic drive status LED driver/module (in drivers/leds??) that
would handle creating the struct led_classdev and the sysfs interface to it
* a PCIe drive status LED driver/module that checks PCI devices for
_DSM or NPEM support, and calls the generic drive status LED
driver/module to create an LED when it finds a device that supports it.
My code only supports _DSM, as I don't have any hardware to test NPEM
on, but it would be very simple to add NPEM support (much more so than
in the patch I sent in this thread).
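To make the split concrete, the generic module would only need to
export something like the following (all of these names are made up
for illustration):

#include <linux/device.h>
#include <linux/types.h>

struct drive_status_led_ops {
        /* bitmask of the states the device can display */
        int (*get_supported)(struct device *dev, u32 *states);
        /* read back / program the currently active states */
        int (*get_current)(struct device *dev, u32 *states);
        int (*set_current)(struct device *dev, u32 states);
};

/*
 * The PCIe module would call this for each device it finds with
 * _DSM (or NPEM) support; the generic module creates the
 * led_classdev and the sysfs attributes on top of the ops.
 */
int drive_status_led_register(struct device *dev,
                              const struct drive_status_led_ops *ops);
void drive_status_led_unregister(struct device *dev);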
I'm not sure if it is worth splitting the code into two pieces,
though... I don't know if anything other than NPEM or the _DSM would
bother to use this, and I think those make more sense in a single module
because they are so similar (the _DSM method was created as a way to do
the same thing as NPEM but without hardware support, and they need the
same code for scanning and detecting added/removed devices).
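For example, the shared detection could be little more than a PCI bus
notifier. A rough sketch, with ssdled_probe_dev()/ssdled_remove_dev()
standing in for the _DSM/NPEM-specific setup and teardown:

#include <linux/device.h>
#include <linux/notifier.h>
#include <linux/pci.h>

static void ssdled_probe_dev(struct pci_dev *pdev);   /* placeholder */
static void ssdled_remove_dev(struct pci_dev *pdev);  /* placeholder */

static int ssdled_bus_notify(struct notifier_block *nb,
                             unsigned long action, void *data)
{
        struct device *dev = data;
        struct pci_dev *pdev = to_pci_dev(dev);

        switch (action) {
        case BUS_NOTIFY_ADD_DEVICE:
                ssdled_probe_dev(pdev);   /* check _DSM/NPEM, add LED */
                break;
        case BUS_NOTIFY_DEL_DEVICE:
                ssdled_remove_dev(pdev);  /* tear the LED back down */
                break;
        }
        return NOTIFY_OK;
}

static struct notifier_block ssdled_nb = {
        .notifier_call = ssdled_bus_notify,
};

/* at module init: bus_register_notifier(&pci_bus_type, &ssdled_nb); */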
Maybe I should just leave this all in one file for now, and it could be
split later if anyone wanted to add support for SES or some other method
in the future?
cat /sys/class/leds/0000:88:00.0::pcie_ssd_status/supported_states
ok            0x0004 [ ]
locate        0x0008 [*]
fail          0x0010 [ ]
rebuild       0x0020 [ ]
pfa           0x0040 [ ]
hotspare      0x0080 [ ]
criticalarray 0x0100 [ ]
failedarray   0x0200 [ ]
invaliddevice 0x0400 [ ]
disabled      0x0800 [ ]
--
supported_states = 0x0008
This is quite verbose for a sysfs property. The common trend for new
properties is that they're consumed by programs as well as humans, so
just outputting a raw number should be sufficient if the values have a
well-defined meaning.
I was able to rework this so it uses a scheduler-style output (like
/sys/block/<dev>/queue/scheduler) and eliminates the numbers, since the
PCI specs are not public (and methods other than NPEM/_DSM might use
different numbers). It only needs a single attribute, "states", which
shows all supported states, with brackets around the ones that are
active. For example:
[ok] [locate] fail rebuild ica ifa
To set multiple states, just echo the states separated by spaces or commas:
echo "locate ica" > states
(I renamed criticalarray and failedarray to ica and ifa to match
ledmon's state names).
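Under the hood, the show/store pair for "states" is straightforward.
Roughly (ssdled_get()/ssdled_set() are placeholders for the actual
_DSM/NPEM accessors, and the bit numbering is driver-local):

#include <linux/bits.h>
#include <linux/device.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/string.h>
#include <linux/sysfs.h>

/* same names as the table above, with the ica/ifa renames */
static const char * const state_names[] = {
        "ok", "locate", "fail", "rebuild", "pfa", "hotspare",
        "ica", "ifa", "invaliddevice", "disabled",
};

/* placeholders for the _DSM/NPEM accessors */
static void ssdled_get(struct device *dev, u32 *supported, u32 *active);
static void ssdled_set(struct device *dev, u32 states);

static ssize_t states_show(struct device *dev,
                           struct device_attribute *attr, char *buf)
{
        u32 supported, active;
        ssize_t len = 0;
        int i;

        ssdled_get(dev, &supported, &active);

        /* list supported states, bracketing the active ones */
        for (i = 0; i < ARRAY_SIZE(state_names); i++) {
                if (!(supported & BIT(i)))
                        continue;
                len += sysfs_emit_at(buf, len,
                                     active & BIT(i) ? "[%s] " : "%s ",
                                     state_names[i]);
        }
        len += sysfs_emit_at(buf, len, "\n");
        return len;
}

static ssize_t states_store(struct device *dev,
                            struct device_attribute *attr,
                            const char *buf, size_t count)
{
        char *dup, *cur, *tok;
        u32 new_states = 0;
        int i;

        dup = cur = kstrndup(buf, count, GFP_KERNEL);
        if (!dup)
                return -ENOMEM;

        /* accept state names separated by spaces or commas */
        while ((tok = strsep(&cur, " ,\n")) != NULL) {
                if (!*tok)
                        continue;
                for (i = 0; i < ARRAY_SIZE(state_names); i++)
                        if (sysfs_streq(tok, state_names[i]))
                                break;
                if (i == ARRAY_SIZE(state_names)) {
                        kfree(dup);
                        return -EINVAL;
                }
                new_states |= BIT(i);
        }
        kfree(dup);

        ssdled_set(dev, new_states);
        return count;
}
static DEVICE_ATTR_RW(states);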
Does this seem better?
Thanks!