RE: [PATCH v2 net 9/9] net: enetc: fix the off-by-one issue in enetc_map_tx_tso_buffs()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> On Wed, Feb 19, 2025 at 01:42:47PM +0800, Wei Fang wrote:
> > There is an off-by-one issue for the err_chained_bd path, it will free
> > one more tx_swbd than expected. But there is no such issue for the
> > err_map_data path.
> 
> It's clear that one of err_chained_bd or err_map_data is wrong, because
> they operate with a different "count" but same "i". But how did you
> determine which one is wrong? Is it based on static analysis? Because I
> think the other one is wrong, more below.
> 
> > To fix this off-by-one issue and make the two error
> > handling consistent, the loop condition of error handling is modified
> > and the 'count++' operation is moved before enetc_map_tx_tso_data().
> >
> > Fixes: fb8629e2cbfc ("net: enetc: add support for software TSO")
> > Cc: stable@xxxxxxxxxxxxxxx
> > Signed-off-by: Wei Fang <wei.fang@xxxxxxx>
> > ---
> >  drivers/net/ethernet/freescale/enetc/enetc.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> b/drivers/net/ethernet/freescale/enetc/enetc.c
> > index 9a24d1176479..fe3967268a19 100644
> > --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> > +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> > @@ -832,6 +832,7 @@ static int enetc_map_tx_tso_buffs(struct enetc_bdr
> *tx_ring, struct sk_buff *skb
> >  			txbd = ENETC_TXBD(*tx_ring, i);
> >  			tx_swbd = &tx_ring->tx_swbd[i];
> >  			prefetchw(txbd);
> > +			count++;
> >
> >  			/* Compute the checksum over this segment of data and
> >  			 * add it to the csum already computed (over the L4
> > @@ -848,7 +849,6 @@ static int enetc_map_tx_tso_buffs(struct enetc_bdr
> *tx_ring, struct sk_buff *skb
> >  				goto err_map_data;
> >
> >  			data_len -= size;
> > -			count++;
> >  			bd_data_num++;
> >  			tso_build_data(skb, &tso, size);
> >
> > @@ -874,13 +874,13 @@ static int enetc_map_tx_tso_buffs(struct
> enetc_bdr *tx_ring, struct sk_buff *skb
> >  	dev_err(tx_ring->dev, "DMA map error");
> >
> >  err_chained_bd:
> > -	do {
> > +	while (count--) {
> >  		tx_swbd = &tx_ring->tx_swbd[i];
> >  		enetc_free_tx_frame(tx_ring, tx_swbd);
> >  		if (i == 0)
> >  			i = tx_ring->bd_count;
> >  		i--;
> > -	} while (count--);
> > +	}
> >
> >  	return 0;
> >  }
> 
> ah, there you go, here's the 3rd instance of TX DMA buffer unmapping :-/
> 
> Forget what I said in reply to patch 1/9 about having common code later.
> After going through the whole set and now seeing this, I now think it's
> better that you create the helper now, and consolidate the 2 instances
> you touch anyway. Later you can make enetc_lso_hw_offload() reuse this
> helper in net-next.
> 
> It should be something like this in the end (sorry, just 1 squashed diff):

I agree with you that we finally need a helper function to replace all the
same code blocks, but I'd like to do that for net-next tree, because as I
replied in patch 2, we don't need the count variable. Currently I am more
focused on solving the problem itself rather than optimizing it. Of course
if you think this is necessary, I can add these changes in the next version. :)

> 
> diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c
> b/drivers/net/ethernet/freescale/enetc/enetc.c
> index 6178157611db..a70e92dcbe2c 100644
> --- a/drivers/net/ethernet/freescale/enetc/enetc.c
> +++ b/drivers/net/ethernet/freescale/enetc/enetc.c
> @@ -106,6 +106,24 @@ static void enetc_free_tx_frame(struct enetc_bdr
> *tx_ring,
>  	}
>  }
> 
> +/**
> + * enetc_unwind_tx_frame() - Unwind the DMA mappings of a multi-buffer TX
> frame
> + * @tx_ring: Pointer to the TX ring on which the buffer descriptors are located
> + * @count: Number of TX buffer descriptors which need to be unmapped
> + * @i: Index of the last successfully mapped TX buffer descriptor
> + */
> +static void enetc_unwind_tx_frame(struct enetc_bdr *tx_ring, int count, int i)
> +{
> +	while (count--) {
> +		struct enetc_tx_swbd *tx_swbd = &tx_ring->tx_swbd[i];
> +
> +		enetc_free_tx_frame(tx_ring, tx_swbd);
> +		if (i == 0)
> +			i = tx_ring->bd_count;
> +		i--;
> +	}
> +}
> +
>  /* Let H/W know BD ring has been updated */
>  static void enetc_update_tx_ring_tail(struct enetc_bdr *tx_ring)
>  {
> @@ -399,13 +417,7 @@ static int enetc_map_tx_buffs(struct enetc_bdr
> *tx_ring, struct sk_buff *skb)
>  dma_err:
>  	dev_err(tx_ring->dev, "DMA map error");
> 
> -	while (count--) {
> -		tx_swbd = &tx_ring->tx_swbd[i];
> -		enetc_free_tx_frame(tx_ring, tx_swbd);
> -		if (i == 0)
> -			i = tx_ring->bd_count;
> -		i--;
> -	}
> +	enetc_unwind_tx_frame(tx_ring, count, i);
> 
>  	return 0;
>  }
> @@ -752,7 +764,6 @@ static int enetc_lso_map_data(struct enetc_bdr
> *tx_ring, struct sk_buff *skb,
> 
>  static int enetc_lso_hw_offload(struct enetc_bdr *tx_ring, struct sk_buff *skb)
>  {
> -	struct enetc_tx_swbd *tx_swbd;
>  	struct enetc_lso_t lso = {0};
>  	int err, i, count = 0;
> 
> @@ -776,13 +787,7 @@ static int enetc_lso_hw_offload(struct enetc_bdr
> *tx_ring, struct sk_buff *skb)
>  	return count;
> 
>  dma_err:
> -	do {
> -		tx_swbd = &tx_ring->tx_swbd[i];
> -		enetc_free_tx_frame(tx_ring, tx_swbd);
> -		if (i == 0)
> -			i = tx_ring->bd_count;
> -		i--;
> -	} while (--count);
> +	enetc_unwind_tx_frame(tx_ring, count, i);
> 
>  	return 0;
>  }
> @@ -877,13 +882,7 @@ static int enetc_map_tx_tso_buffs(struct enetc_bdr
> *tx_ring, struct sk_buff *skb
>  	dev_err(tx_ring->dev, "DMA map error");
> 
>  err_chained_bd:
> -	while (count--) {
> -		tx_swbd = &tx_ring->tx_swbd[i];
> -		enetc_free_tx_frame(tx_ring, tx_swbd);
> -		if (i == 0)
> -			i = tx_ring->bd_count;
> -		i--;
> -	}
> +	enetc_unwind_tx_frame(tx_ring, count, i);
> 
>  	return 0;
>  }
> 
> With the definitions laid out explicitly in a kernel-doc, doesn't the
> rest of the patch look a bit wrong? Why would you increment "count"

Sorry, I don't understand what you mean " With the definitions laid ou
explicitly in a kernel-doc", which kernel-doc?

> before enetc_map_tx_tso_data() succeeds? Why isn't the problem that "i"
> needs to be rolled back on enetc_map_tx_tso_data() failure instead?

The root cause of this issue as you said is that "I" and "count" are not
synchronized. Either moving count++ before enetc_map_tx_tso_data()
or rolling back 'i' after enetc_map_tx_tso_data() fails should solve the
issue. There is no problem with the former, it just loops one more time,
and actually the loop does nothing.





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux