RE: [net-next 1/3] net: dsa: optimize tx timestamp request handling

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Vladimir Oltean <olteanv@xxxxxxxxx>
> Sent: 2021年4月18日 17:19
> To: Y.b. Lu <yangbo.lu@xxxxxxx>
> Cc: netdev@xxxxxxxxxxxxxxx; Richard Cochran <richardcochran@xxxxxxxxx>;
> Vladimir Oltean <vladimir.oltean@xxxxxxx>; David S . Miller
> <davem@xxxxxxxxxxxxx>; Jakub Kicinski <kuba@xxxxxxxxxx>; Jonathan Corbet
> <corbet@xxxxxxx>; Kurt Kanzenbach <kurt@xxxxxxxxxxxxx>; Andrew Lunn
> <andrew@xxxxxxx>; Vivien Didelot <vivien.didelot@xxxxxxxxx>; Florian
> Fainelli <f.fainelli@xxxxxxxxx>; Claudiu Manoil <claudiu.manoil@xxxxxxx>;
> Alexandre Belloni <alexandre.belloni@xxxxxxxxxxx>;
> UNGLinuxDriver@xxxxxxxxxxxxx; linux-doc@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [net-next 1/3] net: dsa: optimize tx timestamp request handling
> 
> On Fri, Apr 16, 2021 at 08:36:53PM +0800, Yangbo Lu wrote:
> > Optimization could be done on dsa_skb_tx_timestamp(), and dsa device
> > drivers should adapt to it.
> >
> > - Check SKBTX_HW_TSTAMP request flag at the very beginning, instead of in
> >   port_txtstamp, so that most skbs not requiring tx timestamp just return.
> 
> Agree that this is a trivial performance optimization with no downside that we
> should be making.
> 
> > - No longer to identify PTP packets, and limit tx timestamping only for PTP
> >   packets. If device driver likes, let device driver do.
> 
> Agree that DSA has a way too heavy hand in imposing upon the driver which
> packets should be timestampable and which ones shouldn't.
> 
> For example, I have a latency measurement tool called isochron which is based
> on hardware timestamping of non-PTP packets (in order to not disturb the PTP
> state machines):
> https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.
> com%2Fvladimiroltean%2Ftsn-scripts&amp;data=04%7C01%7Cyangbo.lu%40
> nxp.com%7C3772f63deb6f4491933208d9024af750%7C686ea1d3bc2b4c6fa9
> 2cd99c5c301635%7C0%7C0%7C637543343267914018%7CUnknown%7CTW
> FpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXV
> CI6Mn0%3D%7C1000&amp;sdata=deOn03j6biPEUwppNkv%2F9if1u5HIdLEtzz
> AkWW0p1rc%3D&amp;reserved=0
> 
> I can't use it on DSA interfaces, for rather artificial reasons.
> 
> > - It is a waste to clone skb directly in dsa_skb_tx_timestamp().
> >   For one-step timestamping, a clone is not needed. For any failure of
> >   port_txtstamp (this may usually happen), the skb clone has to be freed.
> >   So put skb cloning into port_txtstamp where it really needs.
> 
> Mostly agree. For two-step timestamping, it is an operation which all drivers
> need to do, so it is in the common potion. If we want to support one-step, we
> need to avoid cloning the PTP packets.
> 

Thanks a lot for your ACK.

> > Signed-off-by: Yangbo Lu <yangbo.lu@xxxxxxx>
> > ---
> >  Documentation/networking/timestamping.rst     |  7 +++++--
> >  .../net/dsa/hirschmann/hellcreek_hwtstamp.c   | 20 ++++++++++++------
> >  .../net/dsa/hirschmann/hellcreek_hwtstamp.h   |  2 +-
> >  drivers/net/dsa/mv88e6xxx/hwtstamp.c          | 21
> +++++++++++++------
> >  drivers/net/dsa/mv88e6xxx/hwtstamp.h          |  6 +++---
> >  drivers/net/dsa/ocelot/felix.c                | 11 ++++++----
> >  drivers/net/dsa/sja1105/sja1105_ptp.c         |  6 +++++-
> >  drivers/net/dsa/sja1105/sja1105_ptp.h         |  2 +-
> >  include/net/dsa.h                             |  2 +-
> >  net/dsa/slave.c                               | 20 +++++-------------
> >  10 files changed, 57 insertions(+), 40 deletions(-)
> >
> > diff --git a/Documentation/networking/timestamping.rst
> > b/Documentation/networking/timestamping.rst
> > index f682e88fa87e..7f04a699a5d1 100644
> > --- a/Documentation/networking/timestamping.rst
> > +++ b/Documentation/networking/timestamping.rst
> > @@ -635,8 +635,8 @@ in generic code: a BPF classifier
> > (``ptp_classify_raw``) is used to identify  PTP event messages (any
> > other packets, including PTP general messages, are not  timestamped), and
> provides two hooks to drivers:
> >
> > -- ``.port_txtstamp()``: The driver is passed a clone of the
> > timestampable skb
> > -  to be transmitted, before actually transmitting it. Typically, a
> > switch will
> > +- ``.port_txtstamp()``: A clone of the timestampable skb to be
> > +transmitted
> > +  is needed, before actually transmitting it. Typically, a switch
> > +will
> >    have a PTP TX timestamp register (or sometimes a FIFO) where the
> timestamp
> >    becomes available. There may be an IRQ that is raised upon this
> timestamp's
> >    availability, or the driver might have to poll after invoking @@
> > -645,6 +645,9 @@ timestamped), and provides two hooks to drivers:
> >    later use (when the timestamp becomes available). Each skb is annotated
> with
> >    a pointer to its clone, in ``DSA_SKB_CB(skb)->clone``, to ease the driver's
> >    job of keeping track of which clone belongs to which skb.
> > +  But one-step timestamping request is handled differently with above
> > + two-step  timestamping. The skb clone is no longer needed since
> > + hardware will insert  TX time information on packet during egress.
> 
> Bonus points for updating the documentation, but I don't quite like the end
> result. Please feel free to restructure more, in order to have a clearer and more
> coherent explanation.
> 
> Also, this paragraph from right above is no longer true:
> 
> 	In code, DSA provides for most of the infrastructure for timestamping
> already,
> 	in generic code: a BPF classifier (``ptp_classify_raw``) is used to identify
> 	PTP event messages (any other packets, including PTP general messages,
> are not
> 	timestamped), and provides two hooks to drivers:
> 
> It's nothing like that anymore. It's more of a passthrough now with your
> changes, the BPF classifier is not run by the DSA core but optionally by
> individual taggers.
> 
> Here is my attempt of rewriting this documentation paragraph, feel free to
> take which parts you consider relevant:
> 
> -----------------------------[cut here]-----------------------------
> 
> In the generic layer, DSA provides the following infrastructure for PTP
> timestamping:
> 
> - ``.port_txtstamp()``: a hook called prior to the transmission of
>   packets with a hardware TX timestamping request from user space.
>   This is required for two-step timestamping, since the hardware
>   timestamp becomes available after the actual MAC transmission, so the
>   driver must be prepared to correlate the timestamp with the original
>   packet so that it can re-enqueue the packet back into the socket's
>   error queue. To save the packet for when the timestamp becomes
>   available, the driver can call ``skb_clone_sk`` and save the resulting
>   clone in ``DSA_SKB_CB(skb)->clone``. Typically, a switch will have a
>   PTP TX timestamp register (or sometimes a FIFO) where the timestamp
>   becomes available. In case of a FIFO, the hardware might store
>   key-value pairs of PTP sequence ID/message type/domain number and the
>   actual timestamp. To perform the correlation correctly between the
>   packets in a queue waiting for timestamping and the actual timestamps,
>   drivers can use a BPF classifier (``ptp_classify_raw``) to identify
>   the PTP transport type, and ``ptp_parse_header`` to interpret the PTP
>   header fields. There may be an IRQ that is raised upon this
>   timestamp's availability, or the driver might have to poll after
>   invoking ``dev_queue_xmit()`` towards the host interface.
>   One-step TX timestamping do not require packet cloning, since there is
>   no follow-up message required by the PTP protocol (because the
>   TX timestamp is embedded into the packet by the MAC), and therefore
>   user space does not expect the packet annotated with the TX timestamp
>   to be re-enqueued into its socket's error queue.
> 
> - ``.port_rxtstamp()``: On RX, the BPF classifier is run by DSA to
>   identify PTP event messages (any other packets, including PTP general
>   messages, are not timestamped). The original (and only) timestampable
>   skb is provided to the driver, for it to annotate it with a timestamp,
>   if that is immediately available, or defer to later. On reception,
>   timestamps might either be available in-band (through metadata in the
>   DSA header, or attached in other ways to the packet), or out-of-band
>   (through another RX timestamping FIFO). Deferral on RX is typically
>   necessary when retrieving the timestamp needs a sleepable context. In
>   that case, it is the responsibility of the DSA driver to call
>   ``netif_rx_ni()`` on the freshly timestamped skb.
> 
> -----------------------------[cut here]-----------------------------
> 

That's really considerate and accuracy description. I will update with this. (Will adjust if we can drop dsa_skb_cb in dsa core.)
Thanks.

> >
> >  - ``.port_rxtstamp()``: The original (and only) timestampable skb is provided
> >    to the driver, for it to annotate it with a timestamp, if that is
> > immediately diff --git
> > a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > index 69dd9a2e8bb6..2ff4b7c08b72 100644
> > --- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > +++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.c
> > @@ -374,31 +374,39 @@ long hellcreek_hwtstamp_work(struct
> > ptp_clock_info *ptp)  }
> >
> >  bool hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type)
> > +			     struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct hellcreek *hellcreek = ds->priv;
> >  	struct hellcreek_port_hwtstamp *ps;
> >  	struct ptp_header *hdr;
> > +	unsigned int type;
> >
> >  	ps = &hellcreek->ports[port].port_hwtstamp;
> >
> > -	/* Check if the driver is expected to do HW timestamping */
> > -	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
> > +	type = ptp_classify_raw(skb);
> > +	if (type == PTP_CLASS_NONE)
> >  		return false;
> >
> >  	/* Make sure the message is a PTP message that needs to be
> timestamped
> >  	 * and the interaction with the HW timestamping is enabled. If not, stop
> >  	 * here
> >  	 */
> > -	hdr = hellcreek_should_tstamp(hellcreek, port, clone, type);
> > +	hdr = hellcreek_should_tstamp(hellcreek, port, skb, type);
> >  	if (!hdr)
> >  		return false;
> >
> > +	*clone = skb_clone_sk(skb);
> > +	if (!(*clone))
> > +		return false;
> > +
> >  	if (test_and_set_bit_lock(HELLCREEK_HWTSTAMP_TX_IN_PROGRESS,
> > -				  &ps->state))
> > +				  &ps->state)) {
> > +		kfree_skb(*clone);
> > +		*clone = NULL;
> >  		return false;
> > +	}
> >
> > -	ps->tx_skb = clone;
> > +	ps->tx_skb = *clone;
> >
> >  	/* store the number of ticks occurred since system start-up till this
> >  	 * moment
> > diff --git a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > index c0745ffa1ebb..58cc96642076 100644
> > --- a/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > +++ b/drivers/net/dsa/hirschmann/hellcreek_hwtstamp.h
> > @@ -45,7 +45,7 @@ int hellcreek_port_hwtstamp_get(struct dsa_switch
> > *ds, int port,  bool hellcreek_port_rxtstamp(struct dsa_switch *ds, int port,
> >  			     struct sk_buff *clone, unsigned int type);  bool
> > hellcreek_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type);
> > +			     struct sk_buff *skb, struct sk_buff **clone);
> >
> >  int hellcreek_get_ts_info(struct dsa_switch *ds, int port,
> >  			  struct ethtool_ts_info *info);
> > diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > index 094d17a1d037..280a95962861 100644
> > --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.c
> > @@ -469,24 +469,33 @@ long mv88e6xxx_hwtstamp_work(struct
> > ptp_clock_info *ptp)  }
> >
> >  bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type)
> > +			     struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct mv88e6xxx_chip *chip = ds->priv;
> >  	struct mv88e6xxx_port_hwtstamp *ps = &chip->port_hwtstamp[port];
> >  	struct ptp_header *hdr;
> > +	unsigned int type;
> >
> > -	if (!(skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP))
> > -		return false;
> > +	type = ptp_classify_raw(skb);
> > +	if (type == PTP_CLASS_NONE)
> > +		return false
> >
> > -	hdr = mv88e6xxx_should_tstamp(chip, port, clone, type);
> > +	hdr = mv88e6xxx_should_tstamp(chip, port, skb, type);
> >  	if (!hdr)
> >  		return false;
> >
> > +	*clone = skb_clone_sk(skb);
> > +	if (!(*clone))
> > +		return false;
> > +
> >  	if (test_and_set_bit_lock(MV88E6XXX_HWTSTAMP_TX_IN_PROGRESS,
> > -				  &ps->state))
> > +				  &ps->state)) {
> > +		kfree_skb(*clone);
> > +		*clone = NULL;
> >  		return false;
> > +	}
> >
> > -	ps->tx_skb = clone;
> > +	ps->tx_skb = *clone;
> >  	ps->tx_tstamp_start = jiffies;
> >  	ps->tx_seq_id = be16_to_cpu(hdr->sequence_id);
> >
> > diff --git a/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > index 9da9f197ba02..da2b253334d0 100644
> > --- a/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > +++ b/drivers/net/dsa/mv88e6xxx/hwtstamp.h
> > @@ -118,7 +118,7 @@ int mv88e6xxx_port_hwtstamp_get(struct
> dsa_switch
> > *ds, int port,  bool mv88e6xxx_port_rxtstamp(struct dsa_switch *ds, int
> port,
> >  			     struct sk_buff *clone, unsigned int type);  bool
> > mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> > -			     struct sk_buff *clone, unsigned int type);
> > +			     struct sk_buff *skb, struct sk_buff **clone);
> >
> >  int mv88e6xxx_get_ts_info(struct dsa_switch *ds, int port,
> >  			  struct ethtool_ts_info *info);
> > @@ -152,8 +152,8 @@ static inline bool mv88e6xxx_port_rxtstamp(struct
> > dsa_switch *ds, int port,  }
> >
> >  static inline bool mv88e6xxx_port_txtstamp(struct dsa_switch *ds, int port,
> > -					   struct sk_buff *clone,
> > -					   unsigned int type)
> > +					   struct sk_buff *skb,
> > +					   struct sk_buff **clone)
> >  {
> >  	return false;
> >  }
> > diff --git a/drivers/net/dsa/ocelot/felix.c
> > b/drivers/net/dsa/ocelot/felix.c index 6b5442be0230..cdec2f5e271c
> > 100644
> > --- a/drivers/net/dsa/ocelot/felix.c
> > +++ b/drivers/net/dsa/ocelot/felix.c
> > @@ -1396,14 +1396,17 @@ static bool felix_rxtstamp(struct dsa_switch
> > *ds, int port,  }
> >
> >  static bool felix_txtstamp(struct dsa_switch *ds, int port,
> > -			   struct sk_buff *clone, unsigned int type)
> > +			   struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct ocelot *ocelot = ds->priv;
> >  	struct ocelot_port *ocelot_port = ocelot->ports[port];
> >
> > -	if (ocelot->ptp && (skb_shinfo(clone)->tx_flags & SKBTX_HW_TSTAMP)
> &&
> > -	    ocelot_port->ptp_cmd == IFH_REW_OP_TWO_STEP_PTP) {
> > -		ocelot_port_add_txtstamp_skb(ocelot, port, clone);
> > +	if (ocelot->ptp && ocelot_port->ptp_cmd ==
> IFH_REW_OP_TWO_STEP_PTP) {
> > +		*clone = skb_clone_sk(skb);
> > +		if (!(*clone))
> > +			return false;
> > +
> > +		ocelot_port_add_txtstamp_skb(ocelot, port, *clone);
> >  		return true;
> >  	}
> >
> > diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c
> > b/drivers/net/dsa/sja1105/sja1105_ptp.c
> > index 1b90570b257b..6a1f854a8c33 100644
> > --- a/drivers/net/dsa/sja1105/sja1105_ptp.c
> > +++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
> > @@ -436,7 +436,7 @@ bool sja1105_port_rxtstamp(struct dsa_switch *ds,
> int port,
> >   * callback, where we will timestamp it synchronously.
> >   */
> >  bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> > -			   struct sk_buff *skb, unsigned int type)
> > +			   struct sk_buff *skb, struct sk_buff **clone)
> >  {
> >  	struct sja1105_private *priv = ds->priv;
> >  	struct sja1105_port *sp = &priv->ports[port]; @@ -444,6 +444,10 @@
> > bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> >  	if (!sp->hwts_tx_en)
> >  		return false;
> >
> > +	*clone = skb_clone_sk(skb);
> > +	if (!(*clone))
> > +		return false;
> > +
> >  	return true;
> >  }
> >
> > diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.h
> > b/drivers/net/dsa/sja1105/sja1105_ptp.h
> > index 3daa33e98e77..ab80b73219cb 100644
> > --- a/drivers/net/dsa/sja1105/sja1105_ptp.h
> > +++ b/drivers/net/dsa/sja1105/sja1105_ptp.h
> > @@ -105,7 +105,7 @@ bool sja1105_port_rxtstamp(struct dsa_switch *ds,
> int port,
> >  			   struct sk_buff *skb, unsigned int type);
> >
> >  bool sja1105_port_txtstamp(struct dsa_switch *ds, int port,
> > -			   struct sk_buff *skb, unsigned int type);
> > +			   struct sk_buff *skb, struct sk_buff **clone);
> >
> >  int sja1105_hwtstamp_get(struct dsa_switch *ds, int port, struct
> > ifreq *ifr);
> >
> > diff --git a/include/net/dsa.h b/include/net/dsa.h index
> > 1259b0f40684..c8415c324e27 100644
> > --- a/include/net/dsa.h
> > +++ b/include/net/dsa.h
> > @@ -734,7 +734,7 @@ struct dsa_switch_ops {
> >  	int	(*port_hwtstamp_set)(struct dsa_switch *ds, int port,
> >  				     struct ifreq *ifr);
> >  	bool	(*port_txtstamp)(struct dsa_switch *ds, int port,
> > -				 struct sk_buff *clone, unsigned int type);
> > +				 struct sk_buff *skb, struct sk_buff **clone);
> 
> How about not passing "clone" back to DSA as an argument by reference, but
> instead require the driver to populate DSA_SKB_CB(skb)->clone if it needs to do
> so?
> 
> Also, how about changing the return type to void? Returning true or false
> makes no difference.
> 
> >  	bool	(*port_rxtstamp)(struct dsa_switch *ds, int port,
> >  				 struct sk_buff *skb, unsigned int type);
> >
> > diff --git a/net/dsa/slave.c b/net/dsa/slave.c index
> > 9300cb66e500..5b746a903ef4 100644
> > --- a/net/dsa/slave.c
> > +++ b/net/dsa/slave.c
> > @@ -19,7 +19,6 @@
> >  #include <linux/if_bridge.h>
> >  #include <linux/if_hsr.h>
> >  #include <linux/netpoll.h>
> > -#include <linux/ptp_classify.h>
> >
> >  #include "dsa_priv.h"
> >
> > @@ -555,26 +554,19 @@ static void dsa_skb_tx_timestamp(struct
> dsa_slave_priv *p,
> >  				 struct sk_buff *skb)
> >  {
> >  	struct dsa_switch *ds = p->dp->ds;
> > -	struct sk_buff *clone;
> > -	unsigned int type;
> > +	struct sk_buff *clone = NULL;
> >
> > -	type = ptp_classify_raw(skb);
> > -	if (type == PTP_CLASS_NONE)
> > +	if (!(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP))
> >  		return;
> >
> >  	if (!ds->ops->port_txtstamp)
> >  		return;
> >
> > -	clone = skb_clone_sk(skb);
> > -	if (!clone)
> > +	if (!ds->ops->port_txtstamp(ds, p->dp->index, skb, &clone))
> >  		return;
> >
> > -	if (ds->ops->port_txtstamp(ds, p->dp->index, clone, type)) {
> > +	if (clone)
> >  		DSA_SKB_CB(skb)->clone = clone;
> > -		return;
> > -	}
> > -
> > -	kfree_skb(clone);
> >  }
> >
> >  netdev_tx_t dsa_enqueue_skb(struct sk_buff *skb, struct net_device
> > *dev) @@ -628,9 +620,7 @@ static netdev_tx_t dsa_slave_xmit(struct
> > sk_buff *skb, struct net_device *dev)
> >
> >  	DSA_SKB_CB(skb)->clone = NULL;
> >
> > -	/* Identify PTP protocol packets, clone them, and pass them to the
> > -	 * switch driver
> > -	 */
> > +	/* Handle tx timestamp request if has */
> 
> s/if has/if any/

Sorry. Will fix.

> 
> >  	dsa_skb_tx_timestamp(p, skb);
> >
> >  	if (dsa_realloc_skb(skb, dev)) {
> > --
> > 2.25.1
> >




[Index of Archives]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite Forum]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Video 4 Linux]     [Device Mapper]     [Linux Resources]

  Powered by Linux