On 10/24/2014 03:27 AM, Chao Peng wrote:
> On Thu, Oct 23, 2014 at 05:49:23PM -0200, Eduardo Habkost wrote:
>> On Thu, Oct 23, 2014 at 11:02:43AM +0800, Chao Peng wrote:
>> [...]
>>> @@ -707,6 +714,24 @@ typedef union {
>>>  } XMMReg;
>>>
>>>  typedef union {
>>> +    uint8_t _b[32];
>>> +    uint16_t _w[16];
>>> +    uint32_t _l[8];
>>> +    uint64_t _q[4];
>>> +    float32 _s[8];
>>> +    float64 _d[4];
>>> +} YMMReg;
>>> +
>>> +typedef union {
>>> +    uint8_t _b[64];
>>> +    uint16_t _w[32];
>>> +    uint32_t _l[16];
>>> +    uint64_t _q[8];
>>> +    float32 _s[16];
>>> +    float64 _d[8];
>>> +} ZMMReg;
>>> +
>>> +typedef union {
>>>      uint8_t _b[8];
>>>      uint16_t _w[4];
>>>      uint32_t _l[2];
>>> @@ -725,6 +750,20 @@ typedef struct BNDCSReg {
>>>  } BNDCSReg;
>>>
>>>  #ifdef HOST_WORDS_BIGENDIAN
>>> +#define ZMM_B(n) _b[63 - (n)]
>>> +#define ZMM_W(n) _w[31 - (n)]
>>> +#define ZMM_L(n) _l[15 - (n)]
>>> +#define ZMM_S(n) _s[15 - (n)]
>>> +#define ZMM_Q(n) _q[7 - (n)]
>>> +#define ZMM_D(n) _d[7 - (n)]
>>> +
>>> +#define YMM_B(n) _b[31 - (n)]
>>> +#define YMM_W(n) _w[15 - (n)]
>>> +#define YMM_L(n) _l[7 - (n)]
>>> +#define YMM_S(n) _s[7 - (n)]
>>> +#define YMM_Q(n) _q[3 - (n)]
>>> +#define YMM_D(n) _d[3 - (n)]
>>> +
>>>  #define XMM_B(n) _b[15 - (n)]
>>>  #define XMM_W(n) _w[7 - (n)]
>>>  #define XMM_L(n) _l[3 - (n)]
>>> @@ -737,6 +776,20 @@ typedef struct BNDCSReg {
>>>  #define MMX_L(n) _l[1 - (n)]
>>>  #define MMX_S(n) _s[1 - (n)]
>>>  #else
>>> +#define ZMM_B(n) _b[n]
>>> +#define ZMM_W(n) _w[n]
>>> +#define ZMM_L(n) _l[n]
>>> +#define ZMM_S(n) _s[n]
>>> +#define ZMM_Q(n) _q[n]
>>> +#define ZMM_D(n) _d[n]
>>> +
>>> +#define YMM_B(n) _b[n]
>>> +#define YMM_W(n) _w[n]
>>> +#define YMM_L(n) _l[n]
>>> +#define YMM_S(n) _s[n]
>>> +#define YMM_Q(n) _q[n]
>>> +#define YMM_D(n) _d[n]
>>> +
>>
>> I am probably not able to see some future use case for those data
>> structures, but: why all the extra complexity here, if only ZMM_Q and
>> YMM_Q are being used in the code, and the only places affected by the
>> ordering of YMMReg and ZMMReg array elements are the memcpy() calls in
>> kvm_{put,get}_xsave(), where the data always has the same layout?
>>
>
> Thanks Eduardo, then I feel comfortable dropping most of these macros
> and keeping only YMM_Q/ZMM_Q. As there is no actual benefit to the
> ordering, I will also make these two endianness-insensitive.

I think we can keep the macros.

The actual cleanup would be to have a single member for the 32 512-bit
ZMM registers, instead of splitting xmm/ymmh/zmmh/zmm_hi16. This will
get rid of the YMM_* and ZMM_* registers. However, we could then no
longer use simple memcpy()s to marshal in and out of the XSAVE data.

We can do it in 2.2.

Paolo
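
To illustrate the trade-off mentioned above, here is a rough sketch of
what marshalling from a single array of 32 ZMM registers into the split
XSAVE components might look like. It is only an assumption-laden
illustration, not the QEMU code: XSaveSketch, put_q() and zmm_to_xsave()
are hypothetical names, the struct does not reproduce the real XSAVE
offsets, and ZMM_Q is the little-endian variant from the patch.

```c
#include <stdint.h>

/* Hypothetical unified register: one 512-bit value per ZMM register. */
typedef union {
    uint8_t  _b[64];
    uint64_t _q[8];
} ZMMReg;

/* Little-endian variant from the patch; big-endian would be _q[7 - (n)]. */
#define ZMM_Q(n) _q[n]

/* Hypothetical stand-in for the split XSAVE components (not real offsets). */
typedef struct {
    uint8_t xmm[16][16];      /* bits 0..127 of ZMM0-15 */
    uint8_t ymmh[16][16];     /* bits 128..255 of ZMM0-15 */
    uint8_t zmmh[16][32];     /* bits 256..511 of ZMM0-15 */
    uint8_t hi16_zmm[16][64]; /* all 512 bits of ZMM16-31 */
} XSaveSketch;

/* Store one quadword as little-endian bytes, whatever the host order. */
static void put_q(uint8_t *dst, uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        dst[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Split each unified register across the components it belongs to. */
static void zmm_to_xsave(XSaveSketch *x, const ZMMReg regs[32])
{
    for (int r = 0; r < 16; r++) {
        for (int q = 0; q < 2; q++) {
            put_q(&x->xmm[r][8 * q], regs[r].ZMM_Q(q));
            put_q(&x->ymmh[r][8 * q], regs[r].ZMM_Q(q + 2));
        }
        for (int q = 0; q < 4; q++) {
            put_q(&x->zmmh[r][8 * q], regs[r].ZMM_Q(q + 4));
        }
    }
    for (int r = 16; r < 32; r++) {
        for (int q = 0; q < 8; q++) {
            put_q(&x->hi16_zmm[r - 16][8 * q], regs[r].ZMM_Q(q));
        }
    }
}
```

The point of the sketch is the cost: with one member per register, each
of ZMM0-15 has to be scattered over three components, so a single
memcpy() per component no longer does the job.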