On 2021/02/11 12:01, Tetsuo Handa wrote: > On 2021/02/11 10:04, Tetsuo Handa wrote: >> On 2021/02/11 5:15, Shuah Khan wrote: >>> I you would like to fix this, please send me a complete fix. >> >> If you want to handle the unlikely "__kthread_create_on_node() fails without >> being killed" case, such change ( the drivers/usb/usbip/vhci_sysfs.c portion in >> https://syzkaller.appspot.com/x/patch.diff?x=16c3c090d00000 ) should be a separate >> patch. Since this patch declares "Fixes: 9720b4bc76a83807 ("staging/usbip: convert to kthread")", >> this patch intends for the minimal change and does not want to do extra things. >> > > If you want a complete fix, the (untested) diff will become large. I guess that we need to serialize attach operation and reset/detach operations, for it seems there is a race window that might result in "general protection fault in tomoyo_socket_sendmsg_permission". Details follows... > diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c > index be37aec250c2..0d10021c4186 100644 > --- a/drivers/usb/usbip/vhci_sysfs.c > +++ b/drivers/usb/usbip/vhci_sysfs.c > @@ -305,20 +305,22 @@ static ssize_t attach_store(struct device *dev, struct device_attribute *attr, > { > struct socket *socket; > int sockfd = 0; > __u32 port = 0, pdev_nr = 0, rhport = 0, devid = 0, speed = 0; > struct usb_hcd *hcd; > struct vhci_hcd *vhci_hcd; > struct vhci_device *vdev; > struct vhci *vhci; > int err; > unsigned long flags; > + struct task_struct *tx; > + struct task_struct *rx; > > /* > * @rhport: port number of vhci_hcd > * @sockfd: socket descriptor of an established TCP connection > * @devid: unique device identifier in a remote host > * @speed: usb device speed in a remote host > */ > if (sscanf(buf, "%u %u %u %u", &port, &sockfd, &devid, &speed) != 4) > return -EINVAL; > pdev_nr = port_to_pdev_nr(port); > @@ -345,62 +347,83 @@ static ssize_t attach_store(struct device *dev, struct device_attribute *attr, > if (speed == USB_SPEED_SUPER) > vdev = &vhci->vhci_hcd_ss->vdev[rhport]; > else > vdev = &vhci->vhci_hcd_hs->vdev[rhport]; > > /* Extract socket from fd. */ > socket = sockfd_lookup(sockfd, &err); > if (!socket) > return -EINVAL; > > + /* Create threads now. */ > + rx = kthread_create(vhci_rx_loop, &vdev->ud, "vhci_rx"); > + tx = kthread_create(vhci_tx_loop, &vdev->ud, "vhci_tx"); > + if (IS_ERR(rx) || IS_ERR(tx)) > + goto thread_error; > + > /* now need lock until setting vdev status as used */ > > /* begin a lock */ > spin_lock_irqsave(&vhci->lock, flags); > spin_lock(&vdev->ud.lock); > > if (vdev->ud.status != VDEV_ST_NULL) { > /* end of the lock */ > spin_unlock(&vdev->ud.lock); > spin_unlock_irqrestore(&vhci->lock, flags); > > - sockfd_put(socket); > - > dev_err(dev, "port %d already used\n", rhport); > /* > * Will be retried from userspace > * if there's another free port. > */ > - return -EBUSY; > + err = -EBUSY; > + goto thread_error; > } > > dev_info(dev, "pdev(%u) rhport(%u) sockfd(%d)\n", > pdev_nr, rhport, sockfd); > dev_info(dev, "devid(%u) speed(%u) speed_str(%s)\n", > devid, speed, usb_speed_string(speed)); > > vdev->devid = devid; > vdev->speed = speed; > vdev->ud.sockfd = sockfd; > vdev->ud.tcp_socket = socket; > vdev->ud.status = VDEV_ST_NOTASSIGNED; > > spin_unlock(&vdev->ud.lock); > spin_unlock_irqrestore(&vhci->lock, flags); > /* end the lock */ > > - vdev->ud.tcp_rx = kthread_get_run(vhci_rx_loop, &vdev->ud, "vhci_rx"); > - vdev->ud.tcp_tx = kthread_get_run(vhci_tx_loop, &vdev->ud, "vhci_tx"); > + /* Start the threads. */ > + get_task_struct(rx); > + vdev->ud.tcp_rx = rx; > + wake_up_process(rx); Even this seemingly complete fix turned out to be incomplete. kthread_get_run(vhci_rx_loop, &vdev->ud, "vhci_rx") creates a vhci_rx kernel thread and immediately starts vhci_rx_loop() function. Then, vdev->ud.tcp_rx becomes non-NULL because the vhci_rx kernel thread was successfully created. Then in vhci_rx_loop(), vhci_rx_pdu(ud) will be called because kthread_should_stop() is false and usbip_event_happened() is 0. In vhci_rx_pdu(), sock_recvmsg() is called from usbip_recv(). If the peer side of ud->tcp_socket already called shutdown(SHUT_WR), sock_recvmsg() detects end of stream and returns 0. Then, vhci_rx_pdu() prints "connection closed" message and calls usbip_event_add(ud, VDEV_EVENT_DOWN). (Well, where is the code that verifies that tcp_socket is indeed a SOCK_STREAM socket? If the file descriptor passed was a SOCK_DGRAM socket, sock_recvmsg() can't detect end of stream...) Now, kthread_get_run(vhci_tx_loop, &vdev->ud, "vhci_tx") creates a vhci_tx kernel thread and immediately starts vhci_tx_loop() function. But if current thread which called kthread_get_run() is preempted at this moment, vdev->ud.tcp_tx remains NULL despite vhci_tx kernel thread was successfully created. Then in vhci_tx_loop(), wait_event_interruptible() is called because kthread_should_stop() is false and vhci_send_cmd_submit() is 0 and vhci_send_cmd_unlink() is 0. But the vhci_tx kernel thread will call vhci_send_cmd_submit() again when list_empty(&vdev->priv_tx) becomes false. Now, event_handler() is called by the usbip_event singlethreaded workqueue kernel thread because the vhci_rx kernel thread called queue_work(usbip_queue, &usbip_work) from usbip_event_add(). Since VDEV_EVENT_DOWN is defined as USBIP_EH_SHUTDOWN | USBIP_EH_RESET, firstly ud->eh_ops.shutdown(ud) (which is mapped to vhci_shutdown_connection()) is called and then ud->eh_ops.reset(ud) (which is mapped to vhci_device_reset()) is called. In vhci_shutdown_connection(), it tries to shutdown the vhci_tx kernel thread and the vhci_rx kernel thread before resetting vdev->ud.tcp_socket to NULL. But recall that vdev->ud.tcp_tx can remains NULL due to preemption. As a result, despite the effort to handle USBIP_EH_SHUTDOWN before USBIP_EH_RESET, kthread_stop_put(vdev->ud.tcp_tx) won't be called before vdev->ud.tcp_socket = NULL is called. When the vhci_tx kernel thread found that list_empty(&vdev->priv_tx) became false, it calls vhci_send_cmd_submit() and triggers "general protection fault in tomoyo_socket_sendmsg_permission". Therefore, I think "general protection fault in tomoyo_socket_sendmsg_permission" is a NULL pointer dereference which can happen if vhci_shutdown_connection() is called before vdev->ud.tcp_tx becomes non-NULL due to a race condition and that the easiest way is to serialize attach operation and reset/detach operations using a mutex, for event_handler() which calls reset/detach operations is a schedulable context. > + get_task_struct(tx); > + vdev->ud.tcp_tx = tx; > + wake_up_process(tx); Maybe just grouping assignment of vdev->ud.tcp_socket, vdev->ud.status, vdev->ud.tcp_{rx,tx} within one spinlock-protected section might be fine? But since vhci_port_disconnect() from detach_store() seems to be racy with attach_store() (vhci_port_disconnect() can trigger vhci_shutdown_connection() via usbip_event_add(&vdev->ud, VDEV_EVENT_DOWN) as soon as attach_store() did vdev->ud.status = VDEV_ST_NOTASSIGNED), I suggest using a killable mutex for serialization purpose is simpler and safer. > > rh_port_connect(vdev, speed); > > return count; > +thread_error: > + sockfd_put(socket); > + if (IS_ERR(rx)) > + err = PTR_ERR(rx); > + else > + kthread_stop(rx); > + if (IS_ERR(tx)) > + err = PTR_ERR(tx); > + else > + kthread_stop(tx); > + return err; > } > static DEVICE_ATTR_WO(attach); > > #define MAX_STATUS_NAME 16 > > struct status_attr { > struct device_attribute attr; > char name[MAX_STATUS_NAME+1]; > }; >