On Tue, Mar 04, 2025 at 11:18:55AM +0100, Kory Maincent wrote: > From: Kory Maincent (Dent Project) <kory.maincent@xxxxxxxxxxx> > +/** > + * pse_disable_pi_prio - Disable all PIs of a given priority inside a PSE > + * power domain > + * @pcdev: a pointer to the PSE > + * @pw_d: a pointer to the PSE power domain > + * @prio: priority > + * > + * Return: 0 on success and failure value on error > + */ > +static int pse_disable_pi_prio(struct pse_controller_dev *pcdev, > + struct pse_power_domain *pw_d, > + int prio) > +{ > + int i; > + Should we lock the pi[] array at some level? > + for (i = 0; i < pcdev->nr_lines; i++) { > + int ret; > + > + if (pcdev->pi[i].prio != prio || > + pcdev->pi[i].pw_d != pw_d || > + !pcdev->pi[i].admin_state_enabled) > + continue; > + > + ret = pse_disable_pi_pol(pcdev, i); If the PSE has many lower-priority ports, the same set of ports could be repeatedly shut down while higher-priority ports keep power indefinitely. This could result in a starvation issue, where lower-priority group of ports may never get a chance to stay enabled, even if power briefly becomes available. To fix this, we could: - Disallow identical priorities in static mode to ensure a clear shutdown order. - Modify pse_disable_pi_prio() to track freed power and stop disabling once enough power is recovered. static int pse_disable_pi_prio(struct pse_controller_dev *pcdev, struct pse_power_domain *pw_d, int prio, int required_power) { int i, ret; int freed_power = 0; mutex_lock(&pcdev->lock); for (i = 0; i < pcdev->nr_lines; i++) { if (pcdev->pi[i].prio != prio || pcdev->pi[i].pw_d != pw_d || !pcdev->pi[i].admin_state_enabled) continue; ret = pse_disable_pi_pol(pcdev, i); if (ret == 0) freed_power += pcdev->pi[i].pw_allocated_mW; /* Stop once we have freed enough power */ if (freed_power >= required_power) break; } mutex_unlock(&pcdev->lock); return ret; } The current approach still introduces an implicit priority based on loop execution order, but since it's predictable, it can serve as a reasonable default. > + if (ret) > + return ret; > + } > + > + return 0; > +} .... > +/** > + * pse_ethtool_set_prio - Set PSE PI priority according to the budget > + * evaluation strategy > + * @psec: PSE control pointer > + * @extack: extack for reporting useful error messages > + * @prio: priovity value > + * > + * Return: 0 on success and failure value on error > + */ > +int pse_ethtool_set_prio(struct pse_control *psec, > + struct netlink_ext_ack *extack, > + unsigned int prio) > +{ > + struct pse_controller_dev *pcdev = psec->pcdev; > + const struct pse_controller_ops *ops; > + int ret = 0; > + > + if (!pcdev->pi[psec->id].pw_d) { > + NL_SET_ERR_MSG(extack, "no power domain attached"); > + return -EOPNOTSUPP; > + } > + > + /* We don't want priority change in the middle of an > + * enable/disable call or a priority mode change > + */ > + mutex_lock(&pcdev->lock); > + switch (pcdev->pi[psec->id].pw_d->budget_eval_strategy) { > + case ETHTOOL_PSE_BUDGET_EVAL_STRAT_STATIC: > + if (prio > pcdev->nr_lines) { > + NL_SET_ERR_MSG_FMT(extack, > + "priority %d exceed priority max %d", > + prio, pcdev->nr_lines); > + ret = -ERANGE; > + goto out; > + } > + > + pcdev->pi[psec->id].prio = prio; In case we already out of the budget, we will need to re-evaluate the prios. New configuration may affect state of ports. Potentially we may need a bulk interface to assign prios, to speed-up reconfiguration. But it is not needed right now. > + break; > + > + case ETHTOOL_PSE_BUDGET_EVAL_STRAT_DYNAMIC: > + ops = psec->pcdev->ops; > + if (!ops->pi_set_prio) { > + NL_SET_ERR_MSG(extack, > + "pse driver does not support setting port priority"); > + ret = -EOPNOTSUPP; > + goto out; > + } > + > + if (prio > pcdev->pis_prio_max) { > + NL_SET_ERR_MSG_FMT(extack, > + "priority %d exceed priority max %d", > + prio, pcdev->pis_prio_max); > + ret = -ERANGE; > + goto out; > + } > + > + ret = ops->pi_set_prio(pcdev, psec->id, prio); Here too, but in case of microchip PSE it will happen in the firmware. May be add here a comment that currently it is done in firmware and to be extended for the kernel based implementation. > + break; > + > + default: > + ret = -EOPNOTSUPP; > + } > + > +out: > + mutex_unlock(&pcdev->lock); > + return ret; > +} > +EXPORT_SYMBOL_GPL(pse_ethtool_set_prio); -- Pengutronix e.K. | | Steuerwalder Str. 21 | http://www.pengutronix.de/ | 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 | Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |