Re: [PATCH v3 2/3] PCI/NPEM: Add Native PCIe Enclosure Management support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 8 Jul 2024 14:33:34 +0300 (EEST)
Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx> wrote:

> On Fri, 5 Jul 2024, Mariusz Tkaczyk wrote:
> 
> > Native PCIe Enclosure Management (NPEM, PCIe r6.1 sec 6.28) allows
> > managing LED in storage enclosures. NPEM is indication oriented
> > and it does not give direct access to LED. Although each of
> > the indications *could* represent an individual LED, multiple
> > indications could also be represented as a single,
> > multi-color LED or a single LED blinking in a specific interval.
> > The specification leaves that open.
> > 
> > Each enabled indication (capability register bit on) is represented as a
> > ledclass_dev which can be controlled through sysfs. For every ledclass
> > device only 2 brightness states are allowed: LED_ON (1) or LED_OFF (0).
> > It is corresponding to NPEM control register (Indication bit on/off).
> > 
> > Ledclass devices appear in sysfs as child devices (subdirectory) of PCI
> > device which has an NPEM Extended Capability and indication is enabled
> > in NPEM capability register. For example, these are leds created for
> > pcieport "10000:02:05.0" on my setup:
> > 
> > leds/
> > ├── 10000:02:05.0:enclosure:fail
> > ├── 10000:02:05.0:enclosure:locate
> > ├── 10000:02:05.0:enclosure:ok
> > └── 10000:02:05.0:enclosure:rebuild
> > 
> > They can be also found in "/sys/class/leds" directory. Parent PCIe device
> > bdf is used to guarantee uniqueness across leds subsystem.
> > 
> > To enable/disable fail indication "brightness" file can be edited:
> > echo 1 > ./leds/10000:02:05.0:enclosure:fail/brightness
> > echo 0 > ./leds/10000:02:05.0:enclosure:fail/brightness
> > 
> > PCIe r6.1, sec 7.9.19.2 defines the possible indications.
> > 
> > Multiple indications for same parent PCIe device can conflict and
> > hardware may update them when processing new request. To avoid issues,
> > driver refresh all indications by reading back control register.
> > 
> > Driver is projected to be exclusive NPEM extended capability manager.
> > It waits up to 1 second after imposing new request, it doesn't verify if
> > controller is busy before write, assuming that mutex lock gives protection
> > from concurrent updates. Driver is not registered if _DSM LED management
> > is available.
> > 
> > NPEM is a PCIe extended capability so it should be registered in
> > pcie_init_capabilities() but it is not possible due to LED dependency.
> > Parent pci_device must be added earlier for led_classdev_register()
> > to be successful. NPEM does not require configuration on kernel side, it
> > is safe to register LED devices later.
> > 
> > Link: https://members.pcisig.com/wg/PCI-SIG/document/19849 [1]
> > Suggested-by: Lukas Wunner <lukas@xxxxxxxxx>
> > Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@xxxxxxxxxxxxxxx>  
> 
> Looks to be in quite good shape already, one comment below I think should 
> be addressed before this is ready to go.
> 
> > +static int npem_set_active_indications(struct npem *npem, u32 inds)
> > +{
> > +	int ctrl, ret, ret_val;
> > +	u32 cc_status;
> > +
> > +	lockdep_assert_held(&npem->lock);
> > +
> > +	/* This bit is always required */
> > +	ctrl = inds | PCI_NPEM_CTRL_ENABLE;
> > +
> > +	ret = npem_write_ctrl(npem, ctrl);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/*
> > +	 * For the case where a NPEM command has not completed immediately,
> > +	 * it is recommended that software not continuously “spin” on
> > polling
> > +	 * the status register, but rather poll under interrupt at a
> > reduced
> > +	 * rate; for example at 10 ms intervals.
> > +	 *
> > +	 * PCIe r6.1 sec 6.28 "Implementation Note: Software Polling of
> > NPEM
> > +	 * Command Completed"
> > +	 */
> > +	ret = read_poll_timeout(npem_read_reg, ret_val,
> > +				ret_val || (cc_status &
> > PCI_NPEM_STATUS_CC),
> > +				10 * USEC_PER_MSEC, USEC_PER_SEC, false,
> > npem,
> > +				PCI_NPEM_STATUS, &cc_status);
> > +	if (ret)
> > +		return ret_val;  
> 
> Will this work as intended?
> 
> If ret_val gets set, cond in read_poll_timeout() is true and it returns 0 
> so the return branch is not taken.

> 
> Also, when read_poll_timeout() times out, ret_val might not be non-zero.

Yes, it is good catch thanks! What about?

	if (ret)
		return ret;
	if (ret_val)
		return ret_val;

If ret is set it means that it times out- we should return that to caller.

If ret_val is set it means that we received error in npem_read_reg()- we should
return that (device probably is unreachable).

If read_val is set then we are less interested in ret because error from
npem_read_reg() function is more critical, so it is "acceptable" to have ret = 0
in this case.

Mariusz





[Index of Archives]     [DMA Engine]     [Linux Coverity]     [Linux USB]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Greybus]

  Powered by Linux