On 18.05.2016 02:01, Francois Romieu wrote: > The smp_wmb() and wmb() could be made side-by-side once *info is > updated but I don't see the adequate idiom to improve the smp_wmb + wmb > combo. :o/ > >> And the wmb() looks like it should be a dma_wmb(). > > I see two points against it: > - it could be too late for skb_tx_timestamp(). > - arc_emac_tx_clean must not see an index update before the device > got a chance to acquire the descriptor. arc_emac_tx_clean can't > tell the difference between an about-to-be-released descriptor > and a returned-from-device one. > Hi, what about the (only compile tested) code below? The smp_wmb() in tx function combined with the smp_rmb() in tx_clean ensures that the CPU running tx_clean sees consistent values for info, data and skb (thus no need to check for validity of all three values any more). The mb() fulfills several tasks: 1. makes sure that DMA writes to descriptor are completed before the HW is informed. 2. On multi processor systems: ensures that txbd_curr is updated (this is paired with the smp_mb() at the end of tx_clean). 3. Ensure we see the most recent value for tx_dirty. With this we do not have to recheck after we stopped the tx queue. --- a/drivers/net/ethernet/arc/emac_main.c +++ b/drivers/net/ethernet/arc/emac_main.c @@ -162,8 +162,13 @@ static void arc_emac_tx_clean(struct net_device *ndev) struct sk_buff *skb = tx_buff->skb; unsigned int info = le32_to_cpu(txbd->info); - if ((info & FOR_EMAC) || !txbd->data || !skb) + if (info & FOR_EMAC) { + /* Make sure we see consistent values for info, skb + * and data. + */ + smp_rmb(); break; + } if (unlikely(info & (DROP | DEFR | LTCL | UFLO))) { stats->tx_errors++; @@ -679,36 +684,33 @@ static int arc_emac_tx(struct sk_buff *skb, struct net_device *ndev) dma_unmap_addr_set(&priv->tx_buff[*txbd_curr], addr, addr); dma_unmap_len_set(&priv->tx_buff[*txbd_curr], len, len); - priv->txbd[*txbd_curr].data = cpu_to_le32(addr); - /* Make sure pointer to data buffer is set */ - wmb(); + priv->txbd[*txbd_curr].data = cpu_to_le32(addr); + priv->tx_buff[*txbd_curr].skb = skb; - skb_tx_timestamp(skb); + /* Make sure info is set after data and skb with respect to + * other tx_clean(). + */ + smp_wmb(); *info = cpu_to_le32(FOR_EMAC | FIRST_OR_LAST_MASK | len); - /* Make sure info word is set */ - wmb(); - - priv->tx_buff[*txbd_curr].skb = skb; - /* Increment index to point to the next BD */ *txbd_curr = (*txbd_curr + 1) % TX_BD_NUM; - /* Ensure that tx_clean() sees the new txbd_curr before + /* 1.Ensure that tx_clean() sees the new txbd_curr before * checking the queue status. This prevents an unneeded wake * of the queue in tx_clean(). + * 2.Ensure that all values are written to RAM and to DMA + * before hardware is informed. + * 3.Ensure we see the most recent value for tx_dirty. */ - smp_mb(); + mb(); - if (!arc_emac_tx_avail(priv)) { + if (!arc_emac_tx_avail(priv)) netif_stop_queue(ndev); - /* Refresh tx_dirty */ - smp_mb(); - if (arc_emac_tx_avail(priv)) - netif_start_queue(ndev); - } + + skb_tx_timestamp(skb); arc_reg_set(priv, R_STATUS, TXPL_MASK); -- 2.7.0 Regards, Lino