On Thu, Jul 04, 2024 at 10:56:51AM GMT, Adrian Moreno wrote:
> ** Background **
> Currently, OVS supports several packet sampling mechanisms (sFlow,
> per-bridge IPFIX, per-flow IPFIX). These end up being translated into a
> userspace action that needs to be handled by ovs-vswitchd's handler
> threads only to be forwarded to some third-party application that
> will somehow process the sample and provide observability on the
> datapath.
>
> A particularly interesting use-case is controller-driven
> per-flow IPFIX sampling where the OpenFlow controller can add metadata
> to samples (via two 32-bit integers) and this metadata is then available
> to the sample-collecting system for correlation.
>
> ** Problem **
> The fact that sampled traffic shares netlink sockets and handler thread
> time with upcalls, apart from being a performance bottleneck in the
> sample extraction itself, can severely compromise the datapath,
> rendering this solution unfit for highly loaded production systems.
>
> Users are left with few options other than guessing what sampling
> rate will be OK for their traffic pattern and system load and dealing
> with the lost accuracy.
>
> Looking at available infrastructure, an obvious candidate would be
> to use psample. However, its current state does not help with the
> use-case at stake because sampled packets do not contain user-defined
> metadata.
>
> ** Proposal **
> This series is an attempt to fix this situation by extending the
> existing psample infrastructure to carry a variable length
> user-defined cookie.
>
> The main existing user of psample is tc's act_sample. It is also
> extended to forward the action's cookie to psample.
>
> Finally, a new OVS action (OVS_SAMPLE_ATTR_PSAMPLE) is created.
> It accepts a group and an optional cookie and uses psample to
> multicast the packet and the metadata.
>
> --
> v8 -> v9:
> - Rebased.
>
> v7 -> v8:
> - Rebased
> - Redirect flow insertion to /dev/null to avoid spat in test.
> - Removed inline keyword in stub execute_psample_action function.
>
> v6 -> v7:
> - Rebased
> - Fixed typo in comment.
>
> v5 -> v6:
> - Renamed emit_sample -> psample
> - Addressed unused variable and conditional compilation of function.
>
> v4 -> v5:
> - Rebased.
> - Removed leftover enum value and wrapped some long lines in selftests.
>
> v3 -> v4:
> - Rebased.
> - Addressed Jakub's comment on private and unused nla attributes.
>
> v2 -> v3:
> - Addressed comments from Simon, Aaron and Ilya.
> - Dropped probability propagation in nested sample actions.
> - Dropped patch v2's 7/9 in favor of a userspace implementation and
>   consume skb if emit_sample is the last action, same as we do with
>   userspace.
> - Split ovs-dpctl.py features in independent patches.
>
> v1 -> v2:
> - Create a new action ("emit_sample") rather than reuse existing
>   "sample" one.
> - Add probability semantics to psample's sampling rate.
> - Store sampling probability in skb's cb area and use it in emit_sample.
> - Test combining "emit_sample" with "trunc"
> - Drop group_id filtering and tracepoint in psample.
>
> rfc_v2 -> v1:
> - Accommodate Ilya's comments.
> - Split OVS's attribute in two attributes and simplify internal
>   handling of psample arguments.
> - Extend psample and tc with a user-defined cookie.
> - Add a tracepoint to psample to facilitate troubleshooting.
>
> rfc_v1 -> rfc_v2:
> - Use psample instead of a new OVS-only multicast group.
> - Extend psample and tc with a user-defined cookie.
>
> Adrian Moreno (10):
>   net: psample: add user cookie
>   net: sched: act_sample: add action cookie to sample
>   net: psample: skip packet copy if no listeners
>   net: psample: allow using rate as probability
>   net: openvswitch: add psample action
>   net: openvswitch: store sampling probability in cb.
>   selftests: openvswitch: add psample action
>   selftests: openvswitch: add userspace parsing
>   selftests: openvswitch: parse trunc action
>   selftests: openvswitch: add psample test
>
>  Documentation/netlink/specs/ovs_flow.yaml    |  17 ++
>  include/net/psample.h                        |   5 +-
>  include/uapi/linux/openvswitch.h             |  31 +-
>  include/uapi/linux/psample.h                 |  11 +-
>  net/openvswitch/Kconfig                      |   1 +
>  net/openvswitch/actions.c                    |  66 ++++-
>  net/openvswitch/datapath.h                   |   3 +
>  net/openvswitch/flow_netlink.c               |  32 ++-
>  net/openvswitch/vport.c                      |   1 +
>  net/psample/psample.c                        |  16 +-
>  net/sched/act_sample.c                       |  12 +
>  .../selftests/net/openvswitch/openvswitch.sh | 115 +++++++-
>  .../selftests/net/openvswitch/ovs-dpctl.py   | 272 +++++++++++++++++-
>  13 files changed, 566 insertions(+), 16 deletions(-)
>
> --
> 2.45.2
>

Hi,

Simon Horman has spotted that the openvswitch.sh tests are failing in
the debug executor:
https://netdev.bots.linux.dev/contest.html?test=openvswitch-sh

Two tests are failing: psample and upcall_interfaces. Both have a known
source of instability (they use "sleep") that makes them especially
unreliable on slow systems. Aaron and I already discussed this and I'm
working on a patch to make both tests more robust by adding a
wait-and-retry mechanism.

I hope this series can be considered regardless of these flaky tests.

Thanks.
Adrián
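
For illustration only, the wait-and-retry idea mentioned above could be
sketched roughly as follows. This is not the actual patch: the helper
name wait_until, its parameters and the default values are all
hypothetical, and the real selftests may well implement the same idea
in shell instead. The point is simply to poll for a condition up to a
deadline rather than rely on one fixed "sleep".

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.1):
    """Poll predicate() until it returns a truthy value or the timeout expires.

    Hypothetical helper (not part of openvswitch.sh or ovs-dpctl.py).
    Returns True as soon as the condition holds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        # Check the condition first, so a fast system never sleeps at all.
        if predicate():
            return True
        # Give up once the deadline has passed.
        if time.monotonic() >= deadline:
            return False
        # Back off briefly before retrying; slow systems just loop longer.
        time.sleep(interval)
```

A test would then pass in a check such as "the expected sample has been
received" and fail only if wait_until() returns False, so a slow
executor gets the whole timeout while a fast one proceeds immediately.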