Re: [PATCH RFC net-next v3 21/27] net: pse-pd: Add support for getting and setting port priority

Oleksij Rempel <o.rempel@xxxxxxxxxxxxxx> · Wed, 27 Nov 2024 10:30:43 +0100

On Tue, Nov 26, 2024 at 04:52:28PM +0100, Kory Maincent wrote:
> On Tue, 26 Nov 2024 16:31:55 +0100
> Kory Maincent <kory.maincent@xxxxxxxxxxx> wrote:
> 
> > Hello Oleksij,
> > 
> > Thanks for your quick reviews!
> > 
> > On Tue, 26 Nov 2024 09:38:27 +0100
> > Oleksij Rempel <o.rempel@xxxxxxxxxxxxxx> wrote:
> > 
> > > > +int pse_ethtool_set_prio_mode(struct pse_control *psec,
> > > > +			      struct netlink_ext_ack *extack,
> > > > +			      u32 prio_mode)
> > > > +{
> > > > +	struct pse_controller_dev *pcdev = psec->pcdev;
> > > > +	const struct pse_controller_ops *ops;
> > > > +	int ret = 0, i;
> > > > +
> > > > +	if (!(prio_mode & pcdev->port_prio_supp_modes)) {
> > > > +		NL_SET_ERR_MSG(extack, "priority mode not supported");
> > > > +		return -EOPNOTSUPP;
> > > > +	}
> > > > +
> > > > +	if (!pcdev->pi[psec->id].pw_d) {
> > > > +		NL_SET_ERR_MSG(extack, "no power domain attached");
> > > > +		return -EOPNOTSUPP;
> > > > +	}
> > > > +
> > > > +	/* ETHTOOL_PSE_PORT_PRIO_DISABLED can't be ORed with another mode
> > > > */
> > > > +	if (prio_mode & ETHTOOL_PSE_PORT_PRIO_DISABLED &&
> > > > +	    prio_mode & ~ETHTOOL_PSE_PORT_PRIO_DISABLED) {
> > > > +		NL_SET_ERR_MSG(extack,
> > > > +			       "port priority can't be enabled and
> > > > disabled simultaneously");
> > > > +		return -EINVAL;
> > > > +	}
> > > > +
> > > > +	ops = psec->pcdev->ops;
> > > > +
> > > > +	/* We don't want priority mode change in the middle of an
> > > > +	 * enable/disable call
> > > > +	 */
> > > > +	mutex_lock(&pcdev->lock);
> > > > +	pcdev->pi[psec->id].pw_d->port_prio_mode = prio_mode;    
> > > 
> > > In proposed implementation we have can set policies per port, but it
> > > will affect complete domain. This is not good. It feels like a separate
> > > challenge with extra discussion and work. I would recommend not to
> > > implement policy setting right now.
> > > 
> > > If you will decide to implement setting of policies anyway, then we need
> > > to discuss the interface.
> > > - If the policy should be done per domain, then we will need a separate
> > >   interface to interact with domains.
> > >   Pro: seems to be easier to implement.
> > > - If we will go with policy per port, wich would make sense too, then
> > >   some rework of this patch is needed.
> > >   Pro: can combine best of both strategies: set ports with wide load
> > >   range to static strategy and use dynamic strategy on other ports.
> 
> We already talked about it but a policies per port seems irrelevant to me.
> https://lore.kernel.org/netdev/ZySR75i3BEzNbjnv@xxxxxxxxxxxxxx/
> How do we compare the priority value of ports that use different budget
> strategy? How do we manage in the same power domain two ports with
> different budget strategies or disconnection policies?

Good question :)

> We indeed may need a separate interface to configure the PSE power domain
> budget strategies and disconnection policies.

And a way to upload everything in atomic way, but I see it as
optimization and can be done separately

> I think not being able to set the budget evaluation strategy is not relevant
> for now as we don't have PSE which could support both,

Both can be implemented for TI. By constantly polling the channel
current register, it should be possible to implement dynamic strategy.

> but being able to set the disconnection policies may be relevant.
> If we don't add this support to this series how do we decide which is the
> default disconnection policy supported?

Use hard coded one ¯\_(ツ)_/¯

Anyway, assuming you will decide to implement everything per port. Here
how I assume it would work.

Budget Evaluation Strategy: We have following modes for now: disabled, static, and
dynamic.
- Disabled: In this mode, the port is excluded from active budget evaluation. It
  allows the port to violate the budget and is intended primarily for testing
  purposes.

- Static: The static method is the safest option and should be used by default.
  When the static method is enabled for a port, the classification information
  (such as power class) is used for budget evaluation.

- Dynamic: If the dynamic method is used, the software will start with the
  classification as an initial step. After that, it will begin monitoring the
  port by polling the current information for the related channel. The system
  will likely use the maximum detected current and a threshold to update or
  reduce the budget allocation for the related port.

Right now I'm not sure about manual mode - for the case where classification
is not working properly.

Disconnection Policy: can only be applied if the Budget Evaluation Strategy
is not disabled.

- Disabled: The port is not subject to disconnection under any circumstances.
  This can be used for critical devices that need to remain powered at all
  times, or for administrative purposes to avoid unexpected disconnections.
  Possible use cases include testing, allowing user space to implement a
  disconnection policy, or pinning certain ports as the highest priority that
  cannot be overwritten for critical devices. Another use case could be
  temporarily addressing misconfigured priorities: an admin could set the
  policy to disabled, adjust the priority, and use a disconnection policy
  resolution status interface (currently not implemented) to verify if the
  port would be disconnected with the updated priority settings, and then set
  the policy back to a non-disabled state.

- Port Index-Based Policy: "I Know What I Do, but I Do Not Have Permission to
  Configure Software"

  Behavior: With this approach, the port index becomes the way to determine the
  priority of connections. Users can manage priorities simply by deciding which
  port they plug into:

    Lower-Indexed Ports: These ports have higher priority, meaning devices
    plugged into these ports are protected from disconnection until absolutely
    necessary.  Higher-Indexed Ports: These ports are more likely to be
    disconnected first when a power budget violation occurs.

    Use Cases:
	Structured Environment Without Full Control: Ideal for situations where
        users understand the importance of the devices but do not have admin
        access to configure software settings.

        Mechanical Administration: Users can simply re-plug devices into
        different ports to change their priority, using the port layout itself
        as a mechanism for priority management. This allows an effective but
        simple way of reassigning criticality without touching the software
        layer.

- LRC-Based Policy: "I Have No Idea About the Architecture, and I Just Enabled
  This One Device"

    Behavior: In this case, the policy targets recent user actions, assuming
    the user has minimal context about the system's architecture. The most
    recently connected device is likely the least critical and is therefore
    disconnected first.

    Use Cases:
        Chaotic Environment: Suitable for environments where users do not
        know the priority structure and are randomly plugging in devices,
        often without understanding the power budget implications.

        Immediate Feedback: If a user connects a device and it is quickly
        disconnected, they are more likely to notice and either try connecting
        at a different time or consult someone for guidance.

- Round-Robin (RR) Policy: "I Do Not Care, but I Want You All to Suffer Equally"

    Behavior: In the Round-Robin policy, all ports are treated equally without
    any specific prioritization. The disconnection burden is distributed evenly
    among all eligible ports, ensuring that no specific port is repeatedly
    penalized. This policy ensures fairness by giving every port an equal
    chance of being disconnected.

    Use Cases:
        Fair Distribution in Shared Environments: Suitable for environments
        where all devices are of similar importance, and fairness is key. This
        ensures that no single user or device consistently bears the impact of
        a budget violation.

	Non-Critical Setup: Ideal for situations where there are no critical
        devices, and all ports should have an equal chance of being
        disconnected. It provides a simple and fair mechanism for managing
        power resources without requiring specific prioritization.

Regarding the mixing of disconnection policies within one power domain, I've
thought more about how we could implement this, particularly when we focus on
three primary types of policies: Least Recently Connected (LRC), Port
Index-Based (Index), and Round-Robin (RR).

Defined Priority Order:

To make the system straightforward and predictable, we could define a hard
execution priority order for these policies:

- LRC - Highest priority: If a port has LRC enabled, it will be considered for
  disconnection first. This ensures that the least recently connected device is
  always targeted first, which works well for devices that are likely temporary
  or less critical.

- Index-Based - Second priority: Once no more LRC ports are eligible for
  disconnection, the system evaluates Index-Based ports. Ports with a higher
  physical index will be prioritized for disconnection, allowing for some manual
  control by re-arranging device connections.

- Round-Robin (RR) - Lowest priority: Finally, Round-Robin is applied to ensure
  fairness among the remaining eligible ports. This policy cycles through the
  ports to distribute the disconnection burden evenly.

User Configuration

In terms of user configuration:

Users only need to set the top allowed priority for each port. For example, if
a port is set to LRC, it will always be considered first for disconnection
during a budget violation. The connection order of all LRC ports should be
preserved.

If a port is set to Index, it will be preserved until all LRC ports are
disconnected.

Setting a port to RR will make it the last in line for disconnection, thus
ensuring the fairest distribution when other more prioritized policies have
already been applied. However, in practice, it may never be executed if all
ports have higher priority policies.

-- 
Pengutronix e.K.                           |                             |
Steuerwalder Str. 21                       | http://www.pengutronix.de/  |
31137 Hildesheim, Germany                  | Phone: +49-5121-206917-0    |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |