From: Dave Jiang
> Sent: 24 September 2020 19:01
>
> Currently, the MOVDIR64B instruction is used to atomically
> submit 64-byte work descriptors to devices. Although it can
> encounter errors like device queue full, command not accepted,
> device not ready, etc when writing to a device MMIO, MOVDIR64B
> can not report back on errors from the device itself. This
> means that MOVDIR64B users need to separately interact with a
> device to see if a descriptor was successfully queued, which
> slows down device interactions.
>
> ENQCMD and ENQCMDS also atomically submit 64-byte work
> descriptors to devices. But, they *can* report back errors
> directly from the device, such as if the device was busy,
> or device not enabled or does not support the command. This
> immediate feedback from the submission instruction itself
> reduces the number of interactions with the device and can
> greatly increase efficiency.
>
> ENQCMD can be used at any privilege level, but can effectively
> only submit work on behalf of the current process. ENQCMDS is a
> ring0-only instruction and can explicitly specify a process
> context instead of being tied to the current process or needing
> to reprogram the IA32_PASID MSR.
>
> Use ENQCMDS for work submission within the kernel because a
> Process Address ID (PASID) is setup to translate the kernel
> virtual address space. This PASID is provided to ENQCMDS from
> the descriptor structure submitted to the device and not retrieved
> from IA32_PASID MSR, which is setup for the current user address space.
>
> See Intel Software Developer’s Manual for more information on the
> instructions.
>
> Signed-off-by: Dave Jiang <dave.jiang@xxxxxxxxx>
> Reviewed-by: Tony Luck <tony.luck@xxxxxxxxx>
> ---
>  arch/x86/include/asm/special_insns.h | 34 ++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
>
> diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
> index 2258c7d6e281..b4d2ce300c94 100644
> --- a/arch/x86/include/asm/special_insns.h
> +++ b/arch/x86/include/asm/special_insns.h
> @@ -256,6 +256,40 @@ static inline void movdir64b(void *dst, const void *src)
>  		     : "m" (*__src), "a" (__dst), "d" (__src));
>  }
>
> +/**
> + * enqcmds - copy a 512 bits data unit to single MMIO location
> + * @dst: destination, in MMIO space (must be 512-bit aligned)
> + * @src: source
> + *
> + * The ENQCMDS instruction allows software to write a 512 bits command to
> + * a 512 bits aligned special MMIO region that supports the instruction.
> + * A return status is loaded into the ZF flag in the RFLAGS register.
> + * ZF = 0 equates to success, and ZF = 1 indicates retry or error.
> + *
> + * The enqcmds() function uses the ENQCMDS instruction to submit data from
> + * kernel space to MMIO space, in a unit of 512 bits. Order of data access
> + * is not guaranteed, nor is a memory barrier performed afterwards. The
> + * function returns 0 on success and -EAGAIN on failure.
> + *
> + * Warning: Do not use this helper unless your driver has checked that the CPU
> + * instruction is supported on the platform and the device accepts ENQCMDS.
> + */
> +static inline int enqcmds(void __iomem *dst, const void *src)
> +{
> +	int zf;
> +
> +	/* ENQCMDS [rdx], rax */
> +	asm volatile(".byte 0xf3, 0x0f, 0x38, 0xf8, 0x02, 0x66, 0x90"
> +		     CC_SET(z)
> +		     : CC_OUT(z) (zf)
> +		     : "a" (dst), "d" (src));
> +	/* Submission failure is indicated via EFLAGS.ZF=1 */
> +	if (zf)
> +		return -EAGAIN;
> +
> +	return 0;
> +}
> +

Doesn't this need an "m" input constraint for the source buffer?
Otherwise, if it is a local on-stack buffer, the compiler can optimise
away the instructions that write to it.

The missing output memory constraint is less of a problem: the driver
needs to be using barriers of its own anyway.

	David
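
To illustrate the constraint concern, here is a minimal user-space sketch, not the kernel helper itself. ENQCMDS is ring0-only, so a plain 8-byte load through a register-held pointer stands in for it. The names struct desc64, read_desc_head() and submit_local_descriptor() are hypothetical, invented for this illustration. The "m" (*s) input tells the compiler that the asm reads the 64 bytes behind the pointer, so the stores that fill a local descriptor have to be emitted before the asm runs.

```c
#include <stdint.h>
#include <string.h>

/* 64-byte block type so that the "m" operand covers a whole descriptor. */
struct desc64 {
	unsigned char bytes[64];
};

/*
 * Hypothetical stand-in for enqcmds(): the asm dereferences a pointer that
 * is passed in a register, as ENQCMDS does with its source operand.
 */
static inline uint64_t read_desc_head(const void *src)
{
	const struct desc64 *s = src;
	uint64_t head;

#if defined(__x86_64__)
	__asm__ volatile("movq (%1), %0"
			 : "=r" (head)
			 : "r" (s),     /* pointer value only */
			   "m" (*s));   /* declares the 64-byte read */
#else
	/* Assumption: on other architectures fall back to plain C. */
	memcpy(&head, s, sizeof(head));
#endif
	return head;
}

/* Builds the descriptor in an on-stack buffer, the case raised above. */
uint64_t submit_local_descriptor(uint64_t opcode)
{
	struct desc64 d;

	memset(&d, 0, sizeof(d));
	memcpy(d.bytes, &opcode, sizeof(opcode));

	return read_desc_head(&d);
}
```

Without the "m" (*s) operand the only input is the pointer value, and the compiler may treat the stores into d as dead. A "memory" clobber would also force the stores out, at the cost of being a wider compiler barrier than the helper needs.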