Re: "usbip: Implement SG support to vhci-hcd and stub driver" causes a deadlock

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 09, 2019 at 03:35:43PM +0900, Suwan Kim wrote:
> On Mon, Dec 09, 2019 at 04:37:40AM +0100, Marek Marczykowski-Górecki wrote:
> > On Mon, Dec 09, 2019 at 11:01:30AM +0900, Suwan Kim wrote:
> > > On Fri, Dec 06, 2019 at 09:57:42PM +0100, Marek Marczykowski-Górecki wrote:
> > > > [  212.890519] usb 1-1: recv xbuf, 42
> > > 
> > > This message is printed by receive error and before that, driver
> > > canceled URB transmission. we need to know the exact situation
> > > before this message.
> > 
> > I've added some more messages and found recv_size is 0.
> 
> That is the bug point. "size" is urb->actual_length that means
> amount of data actually received from device. And "copy" is
> amount of data received from usbip server. So, in this situation,
> vhci-hcd received all the data from usbip server even if there
> are more sg entries left. So, "copy == 0" means vhci-hcd receives
> all data from the server and we should check "if (copy == 0)" in
> for_each_sg() loop of usbip_recv_xbuff() to exit the loop and not
> to add error event.

That makes sense. But I think there is also another issue here: hang in
case of an error. Here it was EINVAL, but there are probably other
reasons why usbip_recv can fail, like network error or misbehaving
server. This definitely should not cause the client to fail this way...
And also, the actual error code is lost.

> > > Could you send me a longer log messages showing the situation
> > > before "[  212.890519] usb 1-1: recv xbuf, 42"?
> > 
> > Sure, with added extra messages (debug patch below).
> > 
> > [  131.397522] usb 1-1: num_sgs 0
> > [  131.406588] usb 1-1: num_sgs 0
> > [  131.410621] usb 1-1: num_sgs 0
> > [  131.411950] usb 1-1: num_sgs 0
> > [  131.413186] usb 1-1: num_sgs 0
> > [  131.414590] usb 1-1: num_sgs 0
> > [  131.417086] usb 1-1: num_sgs 0
> > [  131.418188] usb 1-1: num_sgs 0
> > [  131.419228] usb 1-1: num_sgs 0
> > [  131.420248] usb 1-1: num_sgs 0
> > [  131.457315] usb 1-1: num_sgs 5
> > [  131.457345] usb 1-1: size 42, copy 42 recv 42, recv_size 42, sg->length 16384
> 
> Device sent 42 bytes data (size 42) and vhci-hcd received 42 bytes
> data from the server. vhci-hcd received all the data and It should
> exit the loop.
> 
> > [  131.457359] usb 1-1: size 42, copy 0 recv -22, recv_size 0, sg->length 16384
> > [  131.457372] usb 1-1: recv xbuf, 42 size 42
> > [  131.458263] vhci_hcd: vhci_shutdown_connection:1024: stop threads
> > [  131.458318] vhci_hcd: vhci_shutdown_connection:1032: release socket
> > [  131.458431] vhci_hcd: vhci_shutdown_connection:1058: disconnect device
> > [  131.460171] usb 1-1: USB disconnect, device number 2
> > 
> > (...)
> > 
> > If I add "if (!recv_size) continue;" there, it works!
> 
> I think we should check "copy" not the "recv_size" because "copy"
> shows the amount of data received from the server.
> 
> int usbip_recv_xbuff(struct usbip_device *ud, struct urb *urb)
> ...
> ...
> 	if (urb->num_sgs) {
> 		copy = size;
> 		for_each_sg(urb->sg, sg, urb->num_sgs, i) {
> 			int recv_size;
> 
> 			if (copy < sg->length)
> 				recv_size = copy;
> 			else
> 				recv_size = sg->length;
> 
> 			recv = usbip_recv(ud->tcp_socket, sg_virt(sg),
> 						recv_size);
> 
> 			if (recv != recv_size)
> 				goto error;
> 
> 			copy -= recv;
> 			ret += recv;
> 			
> 			/* Add here */
> 			if (!copy)
> 				break;
> 			^^^^^^^^^^^^^^
> 		}

This helps too.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Attachment: signature.asc
Description: PGP signature


[Index of Archives]     [Linux Media]     [Linux Input]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Old Linux USB Devel Archive]

  Powered by Linux