With the introduction of movdir64b instruction, there is now an instruction that can write 64 bytes of data atomicaly. Quoting from Intel SDM: "There is no atomicity guarantee provided for the 64-byte load operation from source address, and processor implementations may use multiple load operations to read the 64-bytes. The 64-byte direct-store issued by MOVDIR64B guarantees 64-byte write-completion atomicity. This means that the data arrives at the destination in a single undivided 64-byte write transaction." We have identified at least 3 different use cases for this instruction in the format of func(dst, src, count): 1) Clear poison / Initialize MKTME memory Destination is normal memory. Source in normal memory. Does not increment. (Copy same line to all targets) Count (to clear/init multiple lines) 2) Submit command(s) to new devices Destination is a special MMIO region for a device. Does not increment. Source is normal memory. Increments. Count usually is 1, but can be multiple. 3) Copy to iomem in big chunks Destination is iomem and increments Source in normal memory and increments Count is number of chunks to copy This commit adds support for case #2 to support device that will accept commands via this instruction. Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx> --- arch/x86/include/asm/io.h | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index 9997521fc5cd..2d3c9dd39479 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -399,4 +399,46 @@ extern bool arch_memremap_can_ram_remap(resource_size_t offset, extern bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size); +static inline void __iowrite512(void __iomem *__dst, const void *src) +{ + /* + * Note that this isn't an "on-stack copy", just definition of "dst" + * as a pointer to 64-bytes of stuff that is going to be overwritten. + * In the movdir64b() case that may be needed as you can use the + * MOVDIR64B instruction to copy arbitrary memory around. This trick + * lets the compiler know how much gets clobbered. + */ + volatile struct { char _[64]; } *dst = __dst; + + /* movdir64b [rdx], rax */ + asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : "=m" (dst) + : "d" (src), "a" (dst)); +} + +/** + * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units + * @dst: destination, in MMIO space (must be 512-bit aligned) + * @src: source + * @count: number of 512 bits quantities to submit + * + * Submit data from kernel space to MMIO space, in units of 512 bits at a + * time. Order of access is not guaranteed, nor is a memory barrier + * performed afterwards. + * + * Warning: Do not use this helper unless your driver has checked that the CPU + * instruction is supported on the platform. + */ +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, + size_t count) +{ + const u8 *from = src; + const u8 *end = from + count * 64; + + while (from < end) { + __iowrite512(dst, from); + from += 64; + } +} + #endif /* _ASM_X86_IO_H */