On 23.05.2016 00:36, Francois Romieu wrote: > Lino Sanfilippo <LinoSanfilippo at gmx.de> : > [...] >> --- a/drivers/net/ethernet/arc/emac_main.c >> +++ b/drivers/net/ethernet/arc/emac_main.c >> @@ -159,12 +159,22 @@ static void arc_emac_tx_clean(struct net_device *ndev) >> unsigned int *txbd_dirty = &priv->txbd_dirty; >> struct arc_emac_bd *txbd = &priv->txbd[*txbd_dirty]; >> struct buffer_state *tx_buff = &priv->tx_buff[*txbd_dirty]; >> - struct sk_buff *skb = tx_buff->skb; >> unsigned int info = le32_to_cpu(txbd->info); >> + struct sk_buff *skb; >> > > Insert a smp_rmb() here to close one window for an outdated txbd_dirty value > (the "arc_emac_tx_clean wrote txbd_curr and issued smp_wmb" one). > > Actually, insert smp_rmb() at the start of arc_emac_tx_clean() as it > does not need to be performed withing the loop and would penalize it. I agree, we should place the barrier before the loop. I also think it is indeed more appropriate to use the SMP versions for both barriers. > > Given the implicit smp barriers in the non-driver code, I consider > "arc_emac_tx_clean on one CPU does not read latest txbd_dirty value written > by previous arc_emac_tx_clean on different CPU" as utter onanism but > issueing smp_rmb() at the start of arc_emac_tx_clean() nails it as well. > >> - if ((info & FOR_EMAC) || !txbd->data || !skb) >> + if (*txbd_dirty == priv->txbd_curr) >> break; > > Ok, it's just the "while (priv->txbd_dirty != priv->txbd_curr) {" loop > in disguise. I cant deny that :) > >> >> + /* Make sure curr pointer is consistent with info */ >> + rmb(); >> + >> + info = le32_to_cpu(txbd->info); >> + >> + if (info & FOR_EMAC) >> + break; > > With proper ordering + barrier in arc_emac_tx() you can relax it to smp_rmb(). Ok. >> + >> + skb = tx_buff->skb; >> + >> if (unlikely(info & (DROP | DEFR | LTCL | UFLO))) { >> stats->tx_errors++; >> stats->tx_dropped++; >> @@ -195,8 +205,8 @@ static void arc_emac_tx_clean(struct net_device *ndev) >> *txbd_dirty = (*txbd_dirty + 1) % TX_BD_NUM; >> } >> >> - /* Ensure that txbd_dirty is visible to tx() before checking >> - * for queue stopped. >> + /* Ensure that txbd_dirty is visible to tx() and we see the most recent >> + * value for txbd_curr. >> */ >> smp_mb(); >> >> @@ -680,35 +690,29 @@ static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev) >> dma_unmap_len_set(&priv->tx_buff[*txbd_curr], len, len); >> >> priv->txbd[*txbd_curr].data = cpu_to_le32(addr); >> - >> - /* Make sure pointer to data buffer is set */ >> - wmb(); >> + priv->tx_buff[*txbd_curr].skb = skb; >> >> skb_tx_timestamp(skb); >> >> *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len); > > No. > > You need dma_wmb() after skb_tx_timestamp() to commit skb_tx_timestamp() [*] > and data = cpu_to_le32(addr). I dont agree here. A dma_wmb would have an effect to "data" and "info", yes, but it would have absolutely no effect to skb_tx_timestamp(), since there is no dma access involved here. In fact skb_tx_timestamp() could probably be even reordered to appear after the dma_wmb. Anyway, there is the wmb() directly after the assignment to "info". _This_ barrier should ensure that skb_tx_timestamp() (along with a flush of data and info to DMA) is executed before "txbd_curr" is advanced. This means that the corresponding skb cant be freed prematurely by tx_clean(). > > [*] I doubt anyone want a dma_sync_single_...() here. > >> >> - /* Make sure info word is set */ >> + /* 1. Make sure that with respect to tx_clean everything is set up >> + * properly before we advance txbd_curr. >> + * 2. Make sure writes to DMA descriptors are completed before we inform >> + * the hardware. >> + */ >> wmb(); > > Yes, either wmb() or smp_wmb() + dma_wmb(). > I really prefer one generic barrier over combos of 2 or more special barriers. >> >> - priv->tx_buff[*txbd_curr].skb = skb; >> - >> /* Increment index to point to the next BD */ >> *txbd_curr = (*txbd_curr + 1) % TX_BD_NUM; >> >> - /* Ensure that tx_clean() sees the new txbd_curr before >> - * checking the queue status. This prevents an unneeded wake >> - * of the queue in tx_clean(). >> + /* Ensure we see the most recent value of txbd_dirty and tx_clean() sees >> + * the updated value of txbd_curr. >> */ >> smp_mb(); > > Nit: s/the most/a/ > > "a" as in "arc_emac_tx_clean() _is_ racing with arc_emac_tx" > >> >> - if (!arc_emac_tx_avail(priv)) { >> + if (!arc_emac_tx_avail(priv)) >> netif_stop_queue(ndev); >> - /* Refresh tx_dirty */ >> - smp_mb(); >> - if (arc_emac_tx_avail(priv)) >> - netif_start_queue(ndev); >> - } > > No. > > I may sound like an old record but the revalidation part must be kept. > > txbd_dirty may change in the arc_emac_tx_avail.. netif_stop_queue window > (the race requires a different CPU as arc_emac_tx() runs in locally > disabled BH context). > Ok, I can see the race now. So yes, this should be kept. I will prepare a patch with the discussed changes tomorrow. Regards, Lino