RE: [PATCH 2/3] hv: vmbus_post_msg: retry the hypercall on HV_STATUS_INVALID_CONNECTION_ID

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx]
> Sent: Saturday, January 31, 2015 1:29 AM
> To: Dexuan Cui
> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; driverdev-
> devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx;
> jasowang@xxxxxxxxxx; KY Srinivasan; Haiyang Zhang
> Subject: Re: [PATCH 2/3] hv: vmbus_post_msg: retry the hypercall on
> HV_STATUS_INVALID_CONNECTION_ID
> 
> Dexuan Cui <decui@xxxxxxxxxxxxx> writes:
> 
> >> -----Original Message-----
> >> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx]
> >> Sent: Thursday, January 29, 2015 21:31 PM
> >> To: Dexuan Cui
> >> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; driverdev-
> >> devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx;
> >> jasowang@xxxxxxxxxx; KY Srinivasan; Haiyang Zhang
> >> Subject: Re: [PATCH 2/3] hv: vmbus_post_msg: retry the hypercall on
> >> HV_STATUS_INVALID_CONNECTION_ID
> >>
> >> Dexuan Cui <decui@xxxxxxxxxxxxx> writes:
> >>
> >> > I got the hypercall error code on Hyper-V 2008 R2 when keeping running
> >> > "rmmod hv_netvsc; modprobe hv_netvsc; rmmod hv_utils; modprobe
> hv_utils"
> >> > in a Linux guest.
> >> >
> >> > Without the patch, the driver can occasionally fail to load.
> >> >
> >> > CC: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx>
> >> > Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx>
> >> > ---
> >> >  arch/x86/include/uapi/asm/hyperv.h | 1 +
> >> >  drivers/hv/connection.c            | 9 +++++++++
> >> >  2 files changed, 10 insertions(+)
> >> >
> >> > diff --git a/arch/x86/include/uapi/asm/hyperv.h
> >> b/arch/x86/include/uapi/asm/hyperv.h
> >> > index 90c458e..b9daffb 100644
> >> > --- a/arch/x86/include/uapi/asm/hyperv.h
> >> > +++ b/arch/x86/include/uapi/asm/hyperv.h
> >> > @@ -225,6 +225,7 @@
> >> >  #define HV_STATUS_INVALID_HYPERCALL_CODE	2
> >> >  #define HV_STATUS_INVALID_HYPERCALL_INPUT	3
> >> >  #define HV_STATUS_INVALID_ALIGNMENT		4
> >> > +#define HV_STATUS_INVALID_CONNECTION_ID		18
> >> >  #define HV_STATUS_INSUFFICIENT_BUFFERS		19
> >>
> >> The gap beween 4 and 18 tells me there are other codes here ;-) Are they
> >> all 'permanent failures'?
> > It looks we only need to care about these error codes here.
> >
> > BTW, you can get all the hypercall error codes in the top level functional spec:
> > http://blogs.msdn.com/b/virtual_pc_guy/archive/2014/02/17/updated-
> hypervisor-top-level-functional-specification.aspx
> > For this hypercall (0x005c), see "14.9.7 HvPostMessage".
> 
> Thanks, interesting!
> 
> Btw, HV_STATUS_INSUFFICIENT_MEMORY looks suspicious, looks like we can
> hit it as well...
> 
> I suggest we split all failures here in 2 classes:
> 1) permanent
> 2) worth retrying
> 
> and treat them accordingly (no big changes, just maybe group them within
> hv_post_message() together as it is the only place where these codes are
> being used).
Thanks for the suggestion, Vitaly!
I'll send out  v2.

-- Dexuan

> >
> >> >
> >> >  typedef struct _HV_REFERENCE_TSC_PAGE {
> >> > diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> >> > index c4acd1c..8bd05f3 100644
> >> > --- a/drivers/hv/connection.c
> >> > +++ b/drivers/hv/connection.c
> >> > @@ -440,6 +440,15 @@ int vmbus_post_msg(void *buffer, size_t buflen)
> >> >  		ret = hv_post_message(conn_id, 1, buffer, buflen);
> >> >
> >> >  		switch (ret) {
> >> > +		case HV_STATUS_INVALID_CONNECTION_ID:
> >> > +			/*
> >> > +			 * We could get this if we send messages too
> >> > +			 * frequently or the host is under low resource
> >> > +			 * conditions: let's wait 1 more second before
> >> > +			 * retrying the hypercall.
> >> > +			 */
> >> > +			msleep(1000);
> >> > +			break;
> >>
> >> In case it is our last try (No. 10) we will return '18' from the
> >> function. I suggest we set ret = -ENOMEM here as well.
> > Thanks for the suggestion!
> >
> > I think it would be better to add this to the case
> > HV_STATUS_INVALID_CONNECTION_ID:
> >  ret = -EAGAIN;
> > ?
> 
> Yes, like fallthrough
> 
> >
> >> >  		case HV_STATUS_INSUFFICIENT_BUFFERS:
> >> >  			ret = -ENOMEM;
> >>
> >> Or should we treat these two equally? There is a smaller (100ms) sleep
> >> between tries already, we can consider changing it instead.
> >>
> >> >  		case -ENOMEM:
> >>
> >> --
> >>   Vitaly
> > In my experiments, in the HV_STATUS_INVALID_CONNECTION_ID case,
> > waiting 100ms is not enough sometimes, so I'd like to wait more time.
> > I agree with you both cases can wait 1000ms. I'll update my patch.
> >
> > BTW, the " case -ENOMEM:" is not reachable(the hypervisor itself doesn't
> > return -ENOMEM), I think. I can remove it.
> 
> hv_post_message() can return -EMSGSIZE or do_hypercall() return value
> (which becomes u16 in hv_post_message()). So yes, I agree, -ENOMEM is
> not possible.
> 
> >
> > Thanks,
> > -- Dexuan
> 
> --
>   Vitaly
_______________________________________________
devel mailing list
devel@xxxxxxxxxxxxxxxxxxxxxx
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel




[Index of Archives]     [Linux Driver Backports]     [DMA Engine]     [Linux GPIO]     [Linux SPI]     [Video for Linux]     [Linux USB Devel]     [Linux Coverity]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [Yosemite Backpacking]
  Powered by Linux