Re: [Qemu-devel] [PATCH] target-i386: add Intel AVX-512 support

Paolo Bonzini <pbonzini@xxxxxxxxxx> · Fri, 24 Oct 2014 07:55:10 +0200

On 10/24/2014 03:27 AM, Chao Peng wrote:
> On Thu, Oct 23, 2014 at 05:49:23PM -0200, Eduardo Habkost wrote:
>> On Thu, Oct 23, 2014 at 11:02:43AM +0800, Chao Peng wrote:
>> [...]
>>> @@ -707,6 +714,24 @@ typedef union {
>>>  } XMMReg;
>>>  
>>>  typedef union {
>>> +    uint8_t _b[32];
>>> +    uint16_t _w[16];
>>> +    uint32_t _l[8];
>>> +    uint64_t _q[4];
>>> +    float32 _s[8];
>>> +    float64 _d[4];
>>> +} YMMReg;
>>> +
>>> +typedef union {
>>> +    uint8_t _b[64];
>>> +    uint16_t _w[32];
>>> +    uint32_t _l[16];
>>> +    uint64_t _q[8];
>>> +    float32 _s[16];
>>> +    float64 _d[8];
>>> +} ZMMReg;
>>> +
>>> +typedef union {
>>>      uint8_t _b[8];
>>>      uint16_t _w[4];
>>>      uint32_t _l[2];
>>> @@ -725,6 +750,20 @@ typedef struct BNDCSReg {
>>>  } BNDCSReg;
>>>  
>>>  #ifdef HOST_WORDS_BIGENDIAN
>>> +#define ZMM_B(n) _b[63 - (n)]
>>> +#define ZMM_W(n) _w[31 - (n)]
>>> +#define ZMM_L(n) _l[15 - (n)]
>>> +#define ZMM_S(n) _s[15 - (n)]
>>> +#define ZMM_Q(n) _q[7 - (n)]
>>> +#define ZMM_D(n) _d[7 - (n)]
>>> +
>>> +#define YMM_B(n) _b[31 - (n)]
>>> +#define YMM_W(n) _w[15 - (n)]
>>> +#define YMM_L(n) _l[7 - (n)]
>>> +#define YMM_S(n) _s[7 - (n)]
>>> +#define YMM_Q(n) _q[3 - (n)]
>>> +#define YMM_D(n) _d[3 - (n)]
>>> +
>>>  #define XMM_B(n) _b[15 - (n)]
>>>  #define XMM_W(n) _w[7 - (n)]
>>>  #define XMM_L(n) _l[3 - (n)]
>>> @@ -737,6 +776,20 @@ typedef struct BNDCSReg {
>>>  #define MMX_L(n) _l[1 - (n)]
>>>  #define MMX_S(n) _s[1 - (n)]
>>>  #else
>>> +#define ZMM_B(n) _b[n]
>>> +#define ZMM_W(n) _w[n]
>>> +#define ZMM_L(n) _l[n]
>>> +#define ZMM_S(n) _s[n]
>>> +#define ZMM_Q(n) _q[n]
>>> +#define ZMM_D(n) _d[n]
>>> +
>>> +#define YMM_B(n) _b[n]
>>> +#define YMM_W(n) _w[n]
>>> +#define YMM_L(n) _l[n]
>>> +#define YMM_S(n) _s[n]
>>> +#define YMM_Q(n) _q[n]
>>> +#define YMM_D(n) _d[n]
>>> +
>>
>> I may be missing some future use case for these data structures, but
>> why all the extra complexity here, if only ZMM_Q and YMM_Q are used in
>> the code, and the only place affected by the ordering of the YMMReg and
>> ZMMReg array elements is the memcpy() calls in kvm_{put,get}_xsave(),
>> where the data always has the same layout?
>>
> 
> Thanks Eduardo, then I feel comfortable dropping most of these macros and
> keeping only YMM_Q/ZMM_Q. As there is no actual benefit from the ordering,
> I will also make these two endianness-insensitive.
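
For reference, here is a minimal sketch of what the index reversal in the
accessor macros buys. It uses a trimmed copy of the ZMMReg union from the
patch with only the byte and quadword views; the helper names
low_byte_after_q0_store() and zmm_selfcheck() are made up for illustration
and are not part of the patch. With the reversal, guest byte 0 is the least
significant byte of guest quadword 0 regardless of host endianness.

```c
#include <assert.h>
#include <stdint.h>

/* Trimmed copy of the ZMMReg union from the patch (two views only). */
typedef union {
    uint8_t  _b[64];
    uint64_t _q[8];
} ZMMReg;

/* Same index reversal as in the patch. HOST_WORDS_BIGENDIAN is normally
 * set by QEMU's configure; here it is simply undefined on LE hosts. */
#ifdef HOST_WORDS_BIGENDIAN
#define ZMM_B(n) _b[63 - (n)]
#define ZMM_Q(n) _q[7 - (n)]
#else
#define ZMM_B(n) _b[n]
#define ZMM_Q(n) _q[n]
#endif

/* Hypothetical helper: store a quadword into guest element 0 and
 * return guest byte 0 as seen through the byte view. */
static uint8_t low_byte_after_q0_store(uint64_t v)
{
    ZMMReg r = { { 0 } };

    r.ZMM_Q(0) = v;
    return r.ZMM_B(0);
}

/* Guest byte 0 is the low byte of guest quadword 0 on either host. */
static void zmm_selfcheck(void)
{
    assert(low_byte_after_q0_store(0x0102030405060708ULL) == 0x08);
}
```

This only matters for code that mixes element widths on the same register;
a plain memcpy() of the whole union is unaffected by which macro is used.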

I think we can keep the macros.  The actual cleanup would be to have a
single member for the 32 512-bit ZMM registers, instead of splitting
xmm/ymmh/zmmh/zmm_hi16.  This will get rid of the YMM_* and ZMM_*
registers.  However, we could not use simple memcpy()s to marshal in and
out of the XSAVE data.  We can do it in 2.2.
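
A rough sketch of what that per-register marshalling could look like,
assuming a little-endian host and a simplified stand-in for the XSAVE
components. XSaveSketch, its field names, zmm_to_xsave() and xsave_to_zmm()
are hypothetical; the real XSAVE area has fixed offsets and headers that are
not modelled here. The point is only that each unified register has to be
scattered into three (or one) separate regions instead of being copied in
one block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef union {
    uint8_t  _b[64];
    uint64_t _q[8];
} ZMMReg;

/* Hypothetical, simplified stand-in for the split XSAVE components. */
typedef struct {
    uint8_t xmm[16][16];      /* bits 0-127 of registers 0-15   */
    uint8_t ymmh[16][16];     /* bits 128-255 of registers 0-15 */
    uint8_t zmmh[16][32];     /* bits 256-511 of registers 0-15 */
    uint8_t hi16_zmm[16][64]; /* all 512 bits of registers 16-31 */
} XSaveSketch;

/* Scatter 32 unified registers into the split layout (LE host assumed). */
static void zmm_to_xsave(XSaveSketch *x, const ZMMReg *regs)
{
    for (int i = 0; i < 16; i++) {
        memcpy(x->xmm[i],      regs[i]._b,      16);
        memcpy(x->ymmh[i],     regs[i]._b + 16, 16);
        memcpy(x->zmmh[i],     regs[i]._b + 32, 32);
        memcpy(x->hi16_zmm[i], regs[i + 16]._b, 64);
    }
}

/* Gather the split layout back into 32 unified registers. */
static void xsave_to_zmm(ZMMReg *regs, const XSaveSketch *x)
{
    for (int i = 0; i < 16; i++) {
        memcpy(regs[i]._b,      x->xmm[i],      16);
        memcpy(regs[i]._b + 16, x->ymmh[i],     16);
        memcpy(regs[i]._b + 32, x->zmmh[i],     32);
        memcpy(regs[i + 16]._b, x->hi16_zmm[i], 64);
    }
}

/* Round-trip check: scatter then gather must reproduce the input. */
static void xsave_roundtrip_check(void)
{
    ZMMReg in[32], out[32];
    XSaveSketch x;

    for (int i = 0; i < 32; i++) {
        for (int j = 0; j < 64; j++) {
            in[i]._b[j] = (uint8_t)(i * 64 + j);
        }
    }
    zmm_to_xsave(&x, in);
    xsave_to_zmm(out, &x);
    assert(memcmp(in, out, sizeof(in)) == 0);
}
```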

Paolo