Re: [PATCH net-next v6 06/12] net: pse-pd: Add support for budget evaluation strategies

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 04, 2025 at 11:18:55AM +0100, Kory Maincent wrote:
> From: Kory Maincent (Dent Project) <kory.maincent@xxxxxxxxxxx>
> +/**
> + * pse_disable_pi_prio - Disable all PIs of a given priority inside a PSE
> + *			 power domain
> + * @pcdev: a pointer to the PSE
> + * @pw_d: a pointer to the PSE power domain
> + * @prio: priority
> + *
> + * Return: 0 on success and failure value on error
> + */
> +static int pse_disable_pi_prio(struct pse_controller_dev *pcdev,
> +			       struct pse_power_domain *pw_d,
> +			       int prio)
> +{
> +	int i;
> +

Should we lock the pi[] array at some level?

> +	for (i = 0; i < pcdev->nr_lines; i++) {
> +		int ret;
> +
> +		if (pcdev->pi[i].prio != prio ||
> +		    pcdev->pi[i].pw_d != pw_d ||
> +		    !pcdev->pi[i].admin_state_enabled)
> +			continue;
> +
> +		ret = pse_disable_pi_pol(pcdev, i);

If the PSE has many lower-priority ports, the same set of ports could be
repeatedly shut down while higher-priority ports keep power
indefinitely.

This could result in a starvation issue, where lower-priority group of
ports may never get a chance to stay enabled, even if power briefly
becomes available.

To fix this, we could:
- Disallow identical priorities in static mode to ensure a clear
shutdown order.
- Modify pse_disable_pi_prio() to track freed power and stop
disabling once enough power is recovered.

static int pse_disable_pi_prio(struct pse_controller_dev *pcdev,
                               struct pse_power_domain *pw_d,
                               int prio, int required_power)
{
    int i, ret;
    int freed_power = 0;

    mutex_lock(&pcdev->lock);

    for (i = 0; i < pcdev->nr_lines; i++) {
        if (pcdev->pi[i].prio != prio ||
            pcdev->pi[i].pw_d != pw_d ||
            !pcdev->pi[i].admin_state_enabled)
            continue;

        ret = pse_disable_pi_pol(pcdev, i);
        if (ret == 0)
            freed_power += pcdev->pi[i].pw_allocated_mW;

        /* Stop once we have freed enough power */
        if (freed_power >= required_power)
            break;
    }

    mutex_unlock(&pcdev->lock);
    return ret;
}

The current approach still introduces an implicit priority based on loop
execution order, but since it's predictable, it can serve as a reasonable
default.

> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}

....

> +/**
> + * pse_ethtool_set_prio - Set PSE PI priority according to the budget
> + *			  evaluation strategy
> + * @psec: PSE control pointer
> + * @extack: extack for reporting useful error messages
> + * @prio: priovity value
> + *
> + * Return: 0 on success and failure value on error
> + */
> +int pse_ethtool_set_prio(struct pse_control *psec,
> +			 struct netlink_ext_ack *extack,
> +			 unsigned int prio)
> +{
> +	struct pse_controller_dev *pcdev = psec->pcdev;
> +	const struct pse_controller_ops *ops;
> +	int ret = 0;
> +
> +	if (!pcdev->pi[psec->id].pw_d) {
> +		NL_SET_ERR_MSG(extack, "no power domain attached");
> +		return -EOPNOTSUPP;
> +	}
> +
> +	/* We don't want priority change in the middle of an
> +	 * enable/disable call or a priority mode change
> +	 */
> +	mutex_lock(&pcdev->lock);
> +	switch (pcdev->pi[psec->id].pw_d->budget_eval_strategy) {
> +	case ETHTOOL_PSE_BUDGET_EVAL_STRAT_STATIC:
> +		if (prio > pcdev->nr_lines) {
> +			NL_SET_ERR_MSG_FMT(extack,
> +					   "priority %d exceed priority max %d",
> +					   prio, pcdev->nr_lines);
> +			ret = -ERANGE;
> +			goto out;
> +		}
> +
> +		pcdev->pi[psec->id].prio = prio;

In case we already out of the budget, we will need to re-evaluate the
prios. New configuration may affect state of ports.

Potentially we may need a bulk interface to assign prios, to speed-up
reconfiguration. But it is not needed right now.

> +		break;
> +
> +	case ETHTOOL_PSE_BUDGET_EVAL_STRAT_DYNAMIC:
> +		ops = psec->pcdev->ops;
> +		if (!ops->pi_set_prio) {
> +			NL_SET_ERR_MSG(extack,
> +				       "pse driver does not support setting port priority");
> +			ret = -EOPNOTSUPP;
> +			goto out;
> +		}
> +
> +		if (prio > pcdev->pis_prio_max) {
> +			NL_SET_ERR_MSG_FMT(extack,
> +					   "priority %d exceed priority max %d",
> +					   prio, pcdev->pis_prio_max);
> +			ret = -ERANGE;
> +			goto out;
> +		}
> +
> +		ret = ops->pi_set_prio(pcdev, psec->id, prio);

Here too, but in case of microchip PSE it will happen in the firmware.
May be add here a comment that currently it is done in firmware and to
be extended for the kernel based implementation.

> +		break;
> +
> +	default:
> +		ret = -EOPNOTSUPP;
> +	}
> +
> +out:
> +	mutex_unlock(&pcdev->lock);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(pse_ethtool_set_prio);
-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |




[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux