[PATCH net-next 00/10] devlink and mlx5: Introduce rate domains

Hi,

This series introduces rate domains in devlink and the mlx5
driver. A detailed description by Cosmin follows.

Regards,
Tariq


devlink objects support rate management for tx scheduling, which
involves maintaining a tree of rate nodes that corresponds to tx
schedulers in hardware. 'man devlink-rate' has the full details.

The tree of rate nodes is maintained per devlink object, protected by
the devlink lock.

There exists hardware capable of instantiating a tx scheduling tree
which spans multiple functions of the same physical device (and thus
devlink objects) and therefore the current API and locking scheme is
insufficient.

This patch series changes the devlink rate implementation and API to
allow supporting such hardware and managing tx scheduling trees across
multiple functions of a physical device.

Modeling this requires devlink rate nodes with parents in other
devlink objects. A naive approach that keeps the current
one-lock-per-devlink model does not work, as it would in some cases
require acquiring multiple devlink locks in the correct order.

The proposed solution is to move rates into a separate object named
'rate domain'. Devlink objects create a private rate domain on init, and
hardware that supports cross-function tx scheduling can switch to using
a shared rate domain for a set of devlink objects. Shared rate domains
have an additional lock serializing access to rate nodes.
A new pair of devlink attributes is introduced for specifying a foreign
parent device, and the rate management devlink calls are changed to
allow setting a rate node's parent to the requested foreign parent
device.
Finally, this API is used from mlx5 for NICs with the correct capability
bit to allow cross-function tx scheduling.
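
To illustrate the idea, here is a minimal userspace sketch of private
versus shared rate domains. It is only a model under stated assumptions:
every name in it (struct rate_domain, struct dev, dev_use_shared_domain,
rate_node_set_parent) is hypothetical and not taken from the patches,
and a pthread mutex stands in for whatever locking the kernel code
uses. The point it shows is that a cross-device parent change takes a
single domain lock rather than two per-device locks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are the kernel's. */
struct rate_domain {
	pthread_mutex_t lock;	/* serializes access to all rate nodes in the domain */
	int users;		/* number of devices attached to this domain */
};

struct dev {
	const char *name;
	struct rate_domain private_domain;
	struct rate_domain *domain;	/* private by default, may become shared */
};

struct rate_node {
	struct dev *owner;
	struct rate_node *parent;
};

static void rate_domain_init(struct rate_domain *d)
{
	pthread_mutex_init(&d->lock, NULL);
	d->users = 0;
}

/* Every device starts out with its own private domain. */
static void dev_init(struct dev *dev, const char *name)
{
	dev->name = name;
	rate_domain_init(&dev->private_domain);
	dev->domain = &dev->private_domain;
	dev->domain->users = 1;
}

/* Switch a device from its current domain to a shared one. */
static void dev_use_shared_domain(struct dev *dev, struct rate_domain *shared)
{
	assert(dev->domain->users > 0);
	dev->domain->users--;
	dev->domain = shared;
	shared->users++;
}

/*
 * Set node's parent. Returns 0 on success, or -1 if the two nodes do not
 * live in the same domain. Only one lock is taken, so there is no lock
 * ordering to get right.
 */
static int rate_node_set_parent(struct rate_node *node, struct rate_node *parent)
{
	struct rate_domain *d = node->owner->domain;
	int ret = -1;

	pthread_mutex_lock(&d->lock);
	if (parent->owner->domain == d) {
		node->parent = parent;
		ret = 0;
	}
	pthread_mutex_unlock(&d->lock);
	return ret;
}
```

In this model, two devices that keep their private domains cannot
parent each other's nodes. Once both are switched to the same shared
domain, the same call succeeds under that domain's lock.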

A note about net-shapers:
The net-shapers framework is completely orthogonal to this patch series.
net-shapers does shaping for tx queues, groups of queues and up to the
netdevice level. This patch series is for shaping across functions, so
it is strictly above the netdevice level in the shaping hierarchy.

This patch series was previously sent as an RFC ([1]).

Patches:

Small cleanup:
devlink: Remove unused param of devlink_rate_nodes_check

Introduce private rate domains:
devlink: Store devlink rates in a rate domain

Introduce rate domain locking (a no-op for now, as rate domains are private):
devlink: Serialize access to rate domains

Introduce shared rate domains and a global registry for them:
devlink: Introduce shared rate domains

Extend the devlink rate API with foreign parent devices:
devlink: Allow specifying parent device for rate commands
devlink: Allow rate node parents from other devlinks

Extend the mlx5 implementation with the ability to share qos domains:
net/mlx5: qos: Introduce shared esw qos domains

Use the newly introduced stuff to support cross-function tx scheduling:
net/mlx5: qos: Support cross-esw tx scheduling
net/mlx5: qos: Init shared devlink rate domain

Finally, update documentation:
net/mlx5: Document devlink rates and cross-esw scheduling

[1] https://lore.kernel.org/netdev/20241113203317.2507537-1-cratiu@xxxxxxxxxx/


Cosmin Ratiu (10):
  devlink: Remove unused param of devlink_rate_nodes_check
  devlink: Store devlink rates in a rate domain
  devlink: Serialize access to rate domains
  devlink: Introduce shared rate domains
  devlink: Allow specifying parent device for rate commands
  devlink: Allow rate node parents from other devlinks
  net/mlx5: qos: Introduce shared esw qos domains
  net/mlx5: qos: Support cross-esw tx scheduling
  net/mlx5: qos: Init shared devlink rate domain
  net/mlx5: Document devlink rates and cross-esw scheduling

 Documentation/netlink/specs/devlink.yaml      |  18 +-
 .../networking/devlink/devlink-port.rst       |   2 +
 Documentation/networking/devlink/mlx5.rst     |  33 +++
 .../net/ethernet/mellanox/mlx5/core/esw/qos.c | 144 ++++++++++--
 include/net/devlink.h                         |   8 +
 include/uapi/linux/devlink.h                  |   3 +
 net/devlink/core.c                            |  86 ++++++-
 net/devlink/dev.c                             |   6 +-
 net/devlink/devl_internal.h                   |  34 ++-
 net/devlink/netlink.c                         |  74 ++++--
 net/devlink/netlink_gen.c                     |  20 +-
 net/devlink/netlink_gen.h                     |   7 +
 net/devlink/rate.c                            | 217 +++++++++++++-----
 13 files changed, 548 insertions(+), 104 deletions(-)


base-commit: 8dbf0c7556454b52af91bae305ca71500c31495c
-- 
2.45.0




