On Mon, 8 Jul 2024 14:33:34 +0300 (EEST) Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> wrote: > On Fri, 5 Jul 2024, Mariusz Tkaczyk wrote: > > > Native PCIe Enclosure Management (NPEM, PCIe r6.1 sec 6.28) allows > > managing LED in storage enclosures. NPEM is indication oriented > > and it does not give direct access to LED. Although each of > > the indications *could* represent an individual LED, multiple > > indications could also be represented as a single, > > multi-color LED or a single LED blinking in a specific interval. > > The specification leaves that open. > > > > Each enabled indication (capability register bit on) is represented as a > > ledclass_dev which can be controlled through sysfs. For every ledclass > > device only 2 brightness states are allowed: LED_ON (1) or LED_OFF (0). > > It is corresponding to NPEM control register (Indication bit on/off). > > > > Ledclass devices appear in sysfs as child devices (subdirectory) of PCI > > device which has an NPEM Extended Capability and indication is enabled > > in NPEM capability register. For example, these are leds created for > > pcieport "10000:02:05.0" on my setup: > > > > leds/ > > ├── 10000:02:05.0:enclosure:fail > > ├── 10000:02:05.0:enclosure:locate > > ├── 10000:02:05.0:enclosure:ok > > └── 10000:02:05.0:enclosure:rebuild > > > > They can be also found in "/sys/class/leds" directory. Parent PCIe device > > bdf is used to guarantee uniqueness across leds subsystem. > > > > To enable/disable fail indication "brightness" file can be edited: > > echo 1 > ./leds/10000:02:05.0:enclosure:fail/brightness > > echo 0 > ./leds/10000:02:05.0:enclosure:fail/brightness > > > > PCIe r6.1, sec 7.9.19.2 defines the possible indications. > > > > Multiple indications for same parent PCIe device can conflict and > > hardware may update them when processing new request. To avoid issues, > > driver refresh all indications by reading back control register. > > > > Driver is projected to be exclusive NPEM extended capability manager. > > It waits up to 1 second after imposing new request, it doesn't verify if > > controller is busy before write, assuming that mutex lock gives protection > > from concurrent updates. Driver is not registered if _DSM LED management > > is available. > > > > NPEM is a PCIe extended capability so it should be registered in > > pcie_init_capabilities() but it is not possible due to LED dependency. > > Parent pci_device must be added earlier for led_classdev_register() > > to be successful. NPEM does not require configuration on kernel side, it > > is safe to register LED devices later. > > > > Link: https://members.pcisig.com/wg/PCI-SIG/document/19849 [1] > > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx> > > Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@xxxxxxxxxxxxxxx> > > Looks to be in quite good shape already, one comment below I think should > be addressed before this is ready to go. > > > +static int npem_set_active_indications(struct npem *npem, u32 inds) > > +{ > > + int ctrl, ret, ret_val; > > + u32 cc_status; > > + > > + lockdep_assert_held(&npem->lock); > > + > > + /* This bit is always required */ > > + ctrl = inds | PCI_NPEM_CTRL_ENABLE; > > + > > + ret = npem_write_ctrl(npem, ctrl); > > + if (ret) > > + return ret; > > + > > + /* > > + * For the case where a NPEM command has not completed immediately, > > + * it is recommended that software not continuously “spin” on > > polling > > + * the status register, but rather poll under interrupt at a > > reduced > > + * rate; for example at 10 ms intervals. > > + * > > + * PCIe r6.1 sec 6.28 "Implementation Note: Software Polling of > > NPEM > > + * Command Completed" > > + */ > > + ret = read_poll_timeout(npem_read_reg, ret_val, > > + ret_val || (cc_status & > > PCI_NPEM_STATUS_CC), > > + 10 * USEC_PER_MSEC, USEC_PER_SEC, false, > > npem, > > + PCI_NPEM_STATUS, &cc_status); > > + if (ret) > > + return ret_val; > > Will this work as intended? > > If ret_val gets set, cond in read_poll_timeout() is true and it returns 0 > so the return branch is not taken. > > Also, when read_poll_timeout() times out, ret_val might not be non-zero. Yes, it is good catch thanks! What about? if (ret) return ret; if (ret_val) return ret_val; If ret is set it means that it times out- we should return that to caller. If ret_val is set it means that we received error in npem_read_reg()- we should return that (device probably is unreachable). If read_val is set then we are less interested in ret because error from npem_read_reg() function is more critical, so it is "acceptable" to have ret = 0 in this case. Mariusz