On Wed, Jun 26, 2024 at 10:46:18PM +0800, fred lu wrote:
> Hi Jason,
> Given that x86 architectures provide strong memory ordering guarantees
> for loads and stores by default, it seems that the explicit use of
> 'lfence' may not be necessary for ensuring memory consistency in many
> cases.
> So why not remove 'lfence' from the definition of
> udma_from_device_barrier() for x86_64, similar to the change made in
> udma_to_device_barrier() as seen in patch below?

The trouble with these barriers is none of us really know the x86
definition well enough to be certain of any change. At this point
lfence is proven to work. Perhaps it would be OK to remove it, perhaps
it will mess up PCIe relaxed ordering, or SSE non-temporal, I just
don't know.

To even motivate someone to look at this there would need to be
benchmark results indicating there is a significant gain to be had.

Jason