Re: [PATCH v2 1/6] x86/kernel/hyper-v: xmm fast hypercall as guest

Roman Kagan <rkagan@xxxxxxxxxxxxx> · Tue, 30 Oct 2018 06:11:10 +0000

On Mon, Oct 29, 2018 at 07:33:43PM -0700, Isaku Yamahata wrote:
> On Mon, Oct 29, 2018 at 06:54:50PM +0000,
> Roman Kagan <rkagan@xxxxxxxxxxxxx> wrote:
> > On Wed, Oct 24, 2018 at 09:48:26PM -0700, Isaku Yamahata wrote:
> > > +/* ibytes = fixed header size + var header size + data size in bytes */
> > > +static inline u64 hv_do_xmm_fast_hypercall(
> > > +	u32 varhead_code, void *input, size_t ibytes,
> > > +	void *output, size_t obytes)
> > > +{
> > > +	u64 control = (u64)varhead_code | HV_HYPERCALL_FAST_BIT;
> > > +	u64 hv_status;
> > > +	u64 input1;
> > > +	u64 input2;
> > > +	size_t i_end = roundup(ibytes, 16);
> > > +	size_t o_end = i_end + roundup(obytes, 16);
> > > +	u64 *ixmm = (u64 *)input + 2;
> > > +	u64 tmp[(o_end - 16) / 8] __aligned((16));
> > > +
> > > +	BUG_ON(i_end <= 16);
> > > +	BUG_ON(o_end > HV_XMM_BYTE_MAX);
> > > +	BUG_ON(!IS_ALIGNED((unsigned long)input, 16));
> > > +	BUG_ON(!IS_ALIGNED((unsigned long)output, 16));
> > > +
> > > +	/* it's assumed that there are at least two inputs */
> > > +	input1 = ((u64 *)input)[0];
> > > +	input2 = ((u64 *)input)[1];
> > > +
> > > +	preempt_disable();
> > 
> > Don't you rather need kernel_fpu_begin() here (paired with
> > kernel_fpu_end() at the end)?  This may affect your benchmark results
> > noticeably.
> 
> You're right. For that reason, kernel_fpu_begin/end() is
> intentionally NOT used. I'll add a comment on it.

How do you make sure you don't clobber the task's FPU state then?
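
To illustrate the pairing I mean, here is a minimal userspace sketch, not
kernel code.  kernel_fpu_begin() and kernel_fpu_end() are replaced by
stand-in stubs that only count nesting, and xmm_hypercall_body() is a
hypothetical placeholder for the part of the patch that loads the XMM
registers and issues the hypercall.  In the kernel, kernel_fpu_begin()
already disables preemption, so as far as I can tell it would take the place
of the bare preempt_disable() rather than be added next to it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel API (assumption: they model only the
 * begin/end nesting, not the real save/restore of the task's FPU state).
 */
static int fpu_depth;

static void kernel_fpu_begin(void)
{
	fpu_depth++;
}

static void kernel_fpu_end(void)
{
	fpu_depth--;
}

/*
 * Hypothetical placeholder for the XMM register loads and the hypercall
 * itself.  It must only ever run inside a begin/end section.
 */
static uint64_t xmm_hypercall_body(uint64_t control)
{
	(void)control;
	assert(fpu_depth == 1);
	return 0;	/* pretend the hypercall reported success */
}

/* Bracket every XMM access with kernel_fpu_begin()/kernel_fpu_end(). */
static uint64_t hv_do_xmm_fast_hypercall_sketch(uint64_t control)
{
	uint64_t hv_status;

	kernel_fpu_begin();
	hv_status = xmm_hypercall_body(control);
	kernel_fpu_end();

	return hv_status;
}
```

The point is only that nothing touches the XMM registers outside the
bracketed section; whether the extra save/restore cost is acceptable is
exactly what the benchmark would need to show.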

> > > -	res = hv_do_hypercall(HVCALL_RETARGET_INTERRUPT | (var_size << 17),
> > > -			      params, NULL);
> > > +	res = hv_do_hypercall(
> > > +		HVCALL_RETARGET_INTERRUPT | (var_size << 17),
> > > +		params, sizeof(*params) + var_size * 8, NULL, 0);
> > 
> > This probably isn't performance-critical and can be left as is.
> > (Frankly I'm struggling to understand why this has to be a hypercall at
> > all.)
> 
> If an interrupt is pending, this hypercall gives the VMM a chance to
> inject the interrupt into the VM.

Why wouldn't a regular VMBus message give the VMM that chance?  Anyway,
interrupt rebalancing is not an operation you do frequently, so I don't
see why we should bother optimizing it.

Roman.