On Fri, Oct 24, 2014 at 07:55:10AM +0200, Paolo Bonzini wrote:
> 
> 
> On 10/24/2014 03:27 AM, Chao Peng wrote:
> > On Thu, Oct 23, 2014 at 05:49:23PM -0200, Eduardo Habkost wrote:
> >> On Thu, Oct 23, 2014 at 11:02:43AM +0800, Chao Peng wrote:
> >> [...]
> >>> @@ -707,6 +714,24 @@ typedef union {
> >>>  } XMMReg;
> >>>  
> >>>  typedef union {
> >>> +    uint8_t _b[32];
> >>> +    uint16_t _w[16];
> >>> +    uint32_t _l[8];
> >>> +    uint64_t _q[4];
> >>> +    float32 _s[8];
> >>> +    float64 _d[4];
> >>> +} YMMReg;
> >>> +
> >>> +typedef union {
> >>> +    uint8_t _b[64];
> >>> +    uint16_t _w[32];
> >>> +    uint32_t _l[16];
> >>> +    uint64_t _q[8];
> >>> +    float32 _s[16];
> >>> +    float64 _d[8];
> >>> +} ZMMReg;
> >>> +
> >>> +typedef union {
> >>>      uint8_t _b[8];
> >>>      uint16_t _w[4];
> >>>      uint32_t _l[2];
> >>> @@ -725,6 +750,20 @@ typedef struct BNDCSReg {
> >>>  } BNDCSReg;
> >>>  
> >>>  #ifdef HOST_WORDS_BIGENDIAN
> >>> +#define ZMM_B(n) _b[63 - (n)]
> >>> +#define ZMM_W(n) _w[31 - (n)]
> >>> +#define ZMM_L(n) _l[15 - (n)]
> >>> +#define ZMM_S(n) _s[15 - (n)]
> >>> +#define ZMM_Q(n) _q[7 - (n)]
> >>> +#define ZMM_D(n) _d[7 - (n)]
> >>> +
> >>> +#define YMM_B(n) _b[31 - (n)]
> >>> +#define YMM_W(n) _w[15 - (n)]
> >>> +#define YMM_L(n) _l[7 - (n)]
> >>> +#define YMM_S(n) _s[7 - (n)]
> >>> +#define YMM_Q(n) _q[3 - (n)]
> >>> +#define YMM_D(n) _d[3 - (n)]
> >>> +
> >>>  #define XMM_B(n) _b[15 - (n)]
> >>>  #define XMM_W(n) _w[7 - (n)]
> >>>  #define XMM_L(n) _l[3 - (n)]
> >>> @@ -737,6 +776,20 @@ typedef struct BNDCSReg {
> >>>  #define MMX_L(n) _l[1 - (n)]
> >>>  #define MMX_S(n) _s[1 - (n)]
> >>>  #else
> >>> +#define ZMM_B(n) _b[n]
> >>> +#define ZMM_W(n) _w[n]
> >>> +#define ZMM_L(n) _l[n]
> >>> +#define ZMM_S(n) _s[n]
> >>> +#define ZMM_Q(n) _q[n]
> >>> +#define ZMM_D(n) _d[n]
> >>> +
> >>> +#define YMM_B(n) _b[n]
> >>> +#define YMM_W(n) _w[n]
> >>> +#define YMM_L(n) _l[n]
> >>> +#define YMM_S(n) _s[n]
> >>> +#define YMM_Q(n) _q[n]
> >>> +#define YMM_D(n) _d[n]
> >>> +
> >>
> >> I am probably not able to see some future use case of those data
> >> structures, but: why all the extra complexity here, if only ZMM_Q and
> >> YMM_Q are being used in the code, and the only places affected by the
> >> ordering of YMMReg and ZMMReg array elements are the memcpy() calls in
> >> kvm_{put,get}_xsave(), where the data always have the same layout?
> >>
> > 
> > Thanks Eduardo, then I feel comfortable dropping most of these macros
> > and keeping only YMM_Q/ZMM_Q. As there is no actual benefit to the
> > ordering, I will also make these two endianness-insensitive.
> 
> I think we can keep the macros.  The actual cleanup would be to have a
> single member for the 32 512-bit ZMM registers, instead of splitting
> xmm/ymmh/zmmh/zmm_hi16.  This will get rid of the YMM_* and ZMM_*
> registers.  However, we could not use simple memcpy()s to marshal in
> and out of the XSAVE data.

Agreed. I don't mind keeping those macros in this patch, as this is just
following the existing conventions in the code. Whatever we do to clean
that up, we can do it later.

> We can do it in 2.2.

You mean 2.3, or do you want to clean that up in this release?

-- 
Eduardo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html