Hi Marc,

On 5/1/21 4:30 AM, Marc Zyngier wrote:
>> I think Device GRE has some practical problems.
>> 1. A lot of userspace code which is used to getting write combined
>> mappings to GPU memory from kernel drivers does memcpy/memset on it
>> which can insert ldp/stp which can crash on Device Memory Type. From
>> a quick search I didn't find a memcpy_io or memset_io in
>> glibc. Perhaps there are some other functions available, but a lot
>> of userspace applications that work on x86 and ARM baremetal won't
>> work on ARM VMs without such changes. Changes to all of userspace
>> may not always be practical, specially if linking to binaries
> This seems to go against what Alex was hinting at earlier, which is
> that unaligned accesses were not expected on prefetchable regions, and
> Shanker latter confirming that it was an actual bug. Where do we stand
> here?
>

We agreed to call it a driver bug if the driver does not follow the
semantics of the Linux write-combining API, ioremap_wc(). So far I
haven't found anything in the Linux documentation that explicitly says
whether unaligned accesses are allowed on WC regions. Page faults caused
by the driver's unaligned accesses in kernel space are under the
driver's control, and we'll fix them.

The driver uses the architecture-agnostic functions available in the
Linux kernel and expects the same behavior in a VM as on bare metal. We
would like to keep the driver implementation as architecture-independent
as possible, and unaware of whether it is running in a VM. On ARM64, the
behavior of ioremap_wc() in a VM doesn't match bare metal.

As Vikram says, we don't have any control over userspace applications,
drivers, or libraries. Another example: the GCC memset() function uses
'DC ZVA', which triggers an alignment fault if the actual memory type is
device_xxx.
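To make the userspace concern concrete, here is a minimal sketch of the kind of replacement every caller of memcpy() on such a mapping would need if the memory type were Device. The helper name wc_copy_aligned() is hypothetical; it is not part of glibc or the kernel. It assumes that naturally aligned 1-byte and 8-byte stores are acceptable on the mapping, and it relies on volatile accesses to discourage the compiler from merging stores or turning the loops back into a memcpy() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a glibc or kernel API): copy n bytes from normal
 * memory to a device mapping using only naturally aligned single stores.
 */
static void wc_copy_aligned(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    /* Byte stores until the destination is 8-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 7u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Aligned 64-bit stores. The source is normal memory, so reading it
     * with memcpy() into a local is fine even when it is unaligned. */
    while (n >= 8) {
        uint64_t v;

        memcpy(&v, s, sizeof(v));
        assert(((uintptr_t)d & 7u) == 0); /* alignment invariant */
        *(volatile uint64_t *)d = v;
        d += 8;
        s += 8;
        n -= 8;
    }

    /* Remaining tail bytes, one store each. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

A zeroing variant would need the same treatment in place of memset(). The point is that each application or library touching the mapping would have to carry something like this, which is what we cannot enforce for code we don't control.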