Pavel Machek <pavel@xxxxxxx> writes:

> Hi!
>
>> Add a couple of memory-barriers to ensure correct ordering of read/write
>> access to TX BDs.
>
> So... this is dealing with CPU-to-device consistency, not CPU-to-CPU,
> right?

Actually, a bit of both.

When looping over the buffers, looking for the CMPLT bit in APP0 (in
temac_start_xmit_done()), the challenge is CPU-to-device consistency, as
the CMPLT bit is set by the device and read by the CPU.

But when we clear APP0 (and the other fields) in the same loop, it is
CPU-to-CPU, as APP0 is cleared by the CPU and read by the CPU.

>
>> +++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
>> @@ -774,12 +774,15 @@ static void temac_start_xmit_done(struct net_device *ndev)
>>  	stat = be32_to_cpu(cur_p->app0);
>>
>>  	while (stat & STS_CTRL_APP0_CMPLT) {
>> +		/* Make sure that the other fields are read after bd is
>> +		 * released by dma
>> +		 */
>> +		rmb();
>>  		dma_unmap_single(ndev->dev.parent,
>
> Full barrier, as expected.
>
>> @@ -788,6 +791,12 @@ static void temac_start_xmit_done(struct net_device *ndev)
>>  		ndev->stats.tx_packets++;
>>  		ndev->stats.tx_bytes += be32_to_cpu(cur_p->len);
>>
>> +		/* app0 must be visible last, as it is used to flag
>> +		 * availability of the bd
>> +		 */
>> +		smp_mb();
>
> SMP-only barrier, but full barrier is needed here AFAICT.

I don't think that is needed.  See above.

/Esben