On Mon, 8 Jul 2024, Mariusz Tkaczyk wrote: > On Mon, 8 Jul 2024 14:33:34 +0300 (EEST) > Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> wrote: > > > On Fri, 5 Jul 2024, Mariusz Tkaczyk wrote: > > > > > Native PCIe Enclosure Management (NPEM, PCIe r6.1 sec 6.28) allows > > > managing LED in storage enclosures. NPEM is indication oriented > > > and it does not give direct access to LED. Although each of > > > the indications *could* represent an individual LED, multiple > > > indications could also be represented as a single, > > > multi-color LED or a single LED blinking in a specific interval. > > > The specification leaves that open. > > > > > > Each enabled indication (capability register bit on) is represented as a > > > ledclass_dev which can be controlled through sysfs. For every ledclass > > > device only 2 brightness states are allowed: LED_ON (1) or LED_OFF (0). > > > It is corresponding to NPEM control register (Indication bit on/off). > > > > > > Ledclass devices appear in sysfs as child devices (subdirectory) of PCI > > > device which has an NPEM Extended Capability and indication is enabled > > > in NPEM capability register. For example, these are leds created for > > > pcieport "10000:02:05.0" on my setup: > > > > > > leds/ > > > ├── 10000:02:05.0:enclosure:fail > > > ├── 10000:02:05.0:enclosure:locate > > > ├── 10000:02:05.0:enclosure:ok > > > └── 10000:02:05.0:enclosure:rebuild > > > > > > They can be also found in "/sys/class/leds" directory. Parent PCIe device > > > bdf is used to guarantee uniqueness across leds subsystem. > > > > > > To enable/disable fail indication "brightness" file can be edited: > > > echo 1 > ./leds/10000:02:05.0:enclosure:fail/brightness > > > echo 0 > ./leds/10000:02:05.0:enclosure:fail/brightness > > > > > > PCIe r6.1, sec 7.9.19.2 defines the possible indications. > > > > > > Multiple indications for same parent PCIe device can conflict and > > > hardware may update them when processing new request. To avoid issues, > > > driver refresh all indications by reading back control register. > > > > > > Driver is projected to be exclusive NPEM extended capability manager. > > > It waits up to 1 second after imposing new request, it doesn't verify if > > > controller is busy before write, assuming that mutex lock gives protection > > > from concurrent updates. Driver is not registered if _DSM LED management > > > is available. > > > > > > NPEM is a PCIe extended capability so it should be registered in > > > pcie_init_capabilities() but it is not possible due to LED dependency. > > > Parent pci_device must be added earlier for led_classdev_register() > > > to be successful. NPEM does not require configuration on kernel side, it > > > is safe to register LED devices later. > > > > > > Link: https://members.pcisig.com/wg/PCI-SIG/document/19849 [1] > > > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx> > > > Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@xxxxxxxxxxxxxxx> > > > > Looks to be in quite good shape already, one comment below I think should > > be addressed before this is ready to go. > > > > > +static int npem_set_active_indications(struct npem *npem, u32 inds) > > > +{ > > > + int ctrl, ret, ret_val; > > > + u32 cc_status; > > > + > > > + lockdep_assert_held(&npem->lock); > > > + > > > + /* This bit is always required */ > > > + ctrl = inds | PCI_NPEM_CTRL_ENABLE; > > > + > > > + ret = npem_write_ctrl(npem, ctrl); > > > + if (ret) > > > + return ret; > > > + > > > + /* > > > + * For the case where a NPEM command has not completed immediately, > > > + * it is recommended that software not continuously “spin” on > > > polling > > > + * the status register, but rather poll under interrupt at a > > > reduced > > > + * rate; for example at 10 ms intervals. > > > + * > > > + * PCIe r6.1 sec 6.28 "Implementation Note: Software Polling of > > > NPEM > > > + * Command Completed" > > > + */ > > > + ret = read_poll_timeout(npem_read_reg, ret_val, > > > + ret_val || (cc_status & > > > PCI_NPEM_STATUS_CC), > > > + 10 * USEC_PER_MSEC, USEC_PER_SEC, false, > > > npem, > > > + PCI_NPEM_STATUS, &cc_status); > > > + if (ret) > > > + return ret_val; > > > > Will this work as intended? > > > > If ret_val gets set, cond in read_poll_timeout() is true and it returns 0 > > so the return branch is not taken. > > > > > Also, when read_poll_timeout() times out, ret_val might not be non-zero. > > Yes, it is good catch thanks! What about? > > if (ret) > return ret; > if (ret_val) > return ret_val; > > If ret is set it means that it times out- we should return that to caller. > > If ret_val is set it means that we received error in npem_read_reg()- we should > return that (device probably is unreachable). > > If read_val is set then we are less interested in ret because error from > npem_read_reg() function is more critical, so it is "acceptable" to have ret = 0 > in this case. Yes, I think that will do. -- i.