> -----Original Message----- > From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx] > Sent: Saturday, January 31, 2015 1:29 AM > To: Dexuan Cui > Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; driverdev- > devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx; > jasowang@xxxxxxxxxx; KY Srinivasan; Haiyang Zhang > Subject: Re: [PATCH 2/3] hv: vmbus_post_msg: retry the hypercall on > HV_STATUS_INVALID_CONNECTION_ID > > Dexuan Cui <decui@xxxxxxxxxxxxx> writes: > > >> -----Original Message----- > >> From: Vitaly Kuznetsov [mailto:vkuznets@xxxxxxxxxx] > >> Sent: Thursday, January 29, 2015 21:31 PM > >> To: Dexuan Cui > >> Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; driverdev- > >> devel@xxxxxxxxxxxxxxxxxxxxxx; olaf@xxxxxxxxx; apw@xxxxxxxxxxxxx; > >> jasowang@xxxxxxxxxx; KY Srinivasan; Haiyang Zhang > >> Subject: Re: [PATCH 2/3] hv: vmbus_post_msg: retry the hypercall on > >> HV_STATUS_INVALID_CONNECTION_ID > >> > >> Dexuan Cui <decui@xxxxxxxxxxxxx> writes: > >> > >> > I got the hypercall error code on Hyper-V 2008 R2 when keeping running > >> > "rmmod hv_netvsc; modprobe hv_netvsc; rmmod hv_utils; modprobe > hv_utils" > >> > in a Linux guest. > >> > > >> > Without the patch, the driver can occasionally fail to load. > >> > > >> > CC: "K. Y. Srinivasan" <kys@xxxxxxxxxxxxx> > >> > Signed-off-by: Dexuan Cui <decui@xxxxxxxxxxxxx> > >> > --- > >> > arch/x86/include/uapi/asm/hyperv.h | 1 + > >> > drivers/hv/connection.c | 9 +++++++++ > >> > 2 files changed, 10 insertions(+) > >> > > >> > diff --git a/arch/x86/include/uapi/asm/hyperv.h > >> b/arch/x86/include/uapi/asm/hyperv.h > >> > index 90c458e..b9daffb 100644 > >> > --- a/arch/x86/include/uapi/asm/hyperv.h > >> > +++ b/arch/x86/include/uapi/asm/hyperv.h > >> > @@ -225,6 +225,7 @@ > >> > #define HV_STATUS_INVALID_HYPERCALL_CODE 2 > >> > #define HV_STATUS_INVALID_HYPERCALL_INPUT 3 > >> > #define HV_STATUS_INVALID_ALIGNMENT 4 > >> > +#define HV_STATUS_INVALID_CONNECTION_ID 18 > >> > #define HV_STATUS_INSUFFICIENT_BUFFERS 19 > >> > >> The gap beween 4 and 18 tells me there are other codes here ;-) Are they > >> all 'permanent failures'? > > It looks we only need to care about these error codes here. > > > > BTW, you can get all the hypercall error codes in the top level functional spec: > > http://blogs.msdn.com/b/virtual_pc_guy/archive/2014/02/17/updated- > hypervisor-top-level-functional-specification.aspx > > For this hypercall (0x005c), see "14.9.7 HvPostMessage". > > Thanks, interesting! > > Btw, HV_STATUS_INSUFFICIENT_MEMORY looks suspicious, looks like we can > hit it as well... > > I suggest we split all failures here in 2 classes: > 1) permanent > 2) worth retrying > > and treat them accordingly (no big changes, just maybe group them within > hv_post_message() together as it is the only place where these codes are > being used). Thanks for the suggestion, Vitaly! I'll send out v2. -- Dexuan > > > >> > > >> > typedef struct _HV_REFERENCE_TSC_PAGE { > >> > diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c > >> > index c4acd1c..8bd05f3 100644 > >> > --- a/drivers/hv/connection.c > >> > +++ b/drivers/hv/connection.c > >> > @@ -440,6 +440,15 @@ int vmbus_post_msg(void *buffer, size_t buflen) > >> > ret = hv_post_message(conn_id, 1, buffer, buflen); > >> > > >> > switch (ret) { > >> > + case HV_STATUS_INVALID_CONNECTION_ID: > >> > + /* > >> > + * We could get this if we send messages too > >> > + * frequently or the host is under low resource > >> > + * conditions: let's wait 1 more second before > >> > + * retrying the hypercall. > >> > + */ > >> > + msleep(1000); > >> > + break; > >> > >> In case it is our last try (No. 10) we will return '18' from the > >> function. I suggest we set ret = -ENOMEM here as well. > > Thanks for the suggestion! > > > > I think it would be better to add this to the case > > HV_STATUS_INVALID_CONNECTION_ID: > > ret = -EAGAIN; > > ? > > Yes, like fallthrough > > > > >> > case HV_STATUS_INSUFFICIENT_BUFFERS: > >> > ret = -ENOMEM; > >> > >> Or should we treat these two equally? There is a smaller (100ms) sleep > >> between tries already, we can consider changing it instead. > >> > >> > case -ENOMEM: > >> > >> -- > >> Vitaly > > In my experiments, in the HV_STATUS_INVALID_CONNECTION_ID case, > > waiting 100ms is not enough sometimes, so I'd like to wait more time. > > I agree with you both cases can wait 1000ms. I'll update my patch. > > > > BTW, the " case -ENOMEM:" is not reachable(the hypervisor itself doesn't > > return -ENOMEM), I think. I can remove it. > > hv_post_message() can return -EMSGSIZE or do_hypercall() return value > (which becomes u16 in hv_post_message()). So yes, I agree, -ENOMEM is > not possible. > > > > > Thanks, > > -- Dexuan > > -- > Vitaly _______________________________________________ devel mailing list devel@xxxxxxxxxxxxxxxxxxxxxx http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel