Re: [PATCH v2] cdc-wdm: fix "out-of-sync" due to missing notifications

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[removed the stable CC since this discussion isn't relevant to stable
 anymore]

Oliver Neukum <oneukum@xxxxxxxx> writes:

> On Mon, 2016-07-04 at 19:01 +0200, Bjørn Mork wrote:
>
>> The proposed driver solution is a hack, allowing us to resyncronize
>> without resetting the firmware. We have no reliable way to reset the
>> firmware and doing so will have large side effects anyway.  The hack
>> allows us to continue, with a possibly lost message or two as the only
>> consequences.  We don't have to break any existing session etc.
>
> OK, then I propose it is time to bring out the sledge hammer.
> How about changing usb_cdc_wdm_register() to include a flag
> about the necessity to drain the buffers?

Maybe... But I am hesitating a bit about yet again changing the
usb_cdc_wdm_register() signature.  It is messy, living in the usb tree
while affecting 3 drivers in the netdev tree.  I'm also of the opinion
that those 3 drivers will need this enabled. At least I'm sure both
qmi_wwan and cdc_mbim will.

How about splitting the behaviour locally in cdc-wdm, keeping the
current behaviour for CDC WDM devices and changing to "drain on open"
for the netdev drivers?  Something like this plus the necessary logic
dealing with the "drain_on_open" ("next_desc" is a label in wdm_probe):

diff --git a/drivers/usb/class/cdc-wdm.c b/drivers/usb/class/cdc-wdm.c
index 61ea87917433..fce875bb7bb1 100644
--- a/drivers/usb/class/cdc-wdm.c
+++ b/drivers/usb/class/cdc-wdm.c
@@ -753,7 +753,8 @@ static void wdm_rxwork(struct work_struct *work)
 /* --- hotplug --- */
 
 static int wdm_create(struct usb_interface *intf, struct usb_endpoint_descriptor *ep,
-		u16 bufsize, int (*manage_power)(struct usb_interface *, int))
+		u16 bufsize, int (*manage_power)(struct usb_interface *, int),
+		bool drain_on_open)
 {
 	int rv = -ENOMEM;
 	struct wdm_device *desc;
@@ -913,7 +914,7 @@ next_desc:
 		goto err;
 	ep = &iface->endpoint[0].desc;
 
-	rv = wdm_create(intf, ep, maxcom, &wdm_manage_power);
+	rv = wdm_create(intf, ep, maxcom, &wdm_manage_power, false);
 
 err:
 	return rv;
@@ -945,7 +946,7 @@ struct usb_driver *usb_cdc_wdm_register(struct usb_interface *intf,
 {
 	int rv = -EINVAL;
 
-	rv = wdm_create(intf, ep, bufsize, manage_power);
+	rv = wdm_create(intf, ep, bufsize, manage_power, true);
 	if (rv < 0)
 		goto err;
 


>> But it turns out that it is a real problem for the very same MBIM
>> firmwares which caused us to need the hack in the first place: They
>> stall when you try to read and there is no data.  Which means that a
>> lost race can end up with an EPIPE being returned to _read(), where it
>> is translated and returned to userspace.
>
> This sucks and it tells me even more that CDC MBIM should tell us when
> to apply this work around at startup.

Yes.  We could alternatively filter the EPIPE from read(), since it
isn't supposed to happen anyway.  But it's not going to look less ugly :(

>> This is a mess.  I fully agree that it looks like this either should go
>> into both open() and resume(), or none.  The real answer is 'none'.
>> 
>> But we need the hack for the yucky MBIM firmwares.  Putting it in open()
>> is defining open() as a syncronization point for something which should
>> not need syncronization in a perfect world.  I can expand the comment
>> there a bit to make this clear. Making resume() a syncronization point
>> as well is only making things worse.
>
> I'd say add a flag, let CDC MBIM set it unconditionally (we can switch
> to a black list if we absolutely need to) and include a big, fat
> commentary. There is no good solution for this problem.

Execpt for the extreme ugliness, I don't think it will hurt to apply
this unconditionally for all the network drivers using cdc-wdm, as long
as it is limited to open only.

I certainly want to avoid any blacklist.  Device IDs are cheap in this
market. The MBIM modems typically run Android and the device ID is
configured in NVRAM for whatever OEM laptop vendor it is sold to.
Having a catch-all class driver for MBIM is an absolute requirement.
Making it depend on lists of devices is not an option, IMHO.

On the opposite side we have the huawei NCM modems, where device IDs are
so expensive that they have been using the same small set for thousands
of modems based on various hardware and firmware.


Bjørn
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]