Hi

On Fri, Aug 23, 2024 at 8:40 AM 胡连勤 <hulianqin@xxxxxxxx> wrote:
>
> Hello linux community expert:
>
> >> Fixes: c1dca562be8a ("usb gadget: split out serial core")
> >> Cc: stable@xxxxxxxxxxxxxxx
> >> Signed-off-by: Lianqin Hu <hulianqin@xxxxxxxx>
> >> ---
> >> v6:
> >>   - Update the commit text
> >>   - Add the Fixes tag
> >>   - CC stable kernel
> >>   - Add serial_port_lock protection when checking port pointer
> >>   - Optimize code comments
> >>   - Delete log printing
>
> >You need to list ALL of the versions here, I seem to have missed v4 and
> >v5 somewhere so I don't know what changed there.
>
> V4: Add cc stable kernel    >> Cc: stable@xxxxxxxxxxxxxxx
> V5: Add the Fixes tag       >> Fixes: c1dca562be8a ("usb gadget: split out serial core")
> >You can also add the Fixes tag and CC stable kernel, so that it can be
> >backported to older kernels (such as 5.15) also.
> --------- The above two lines are from Prashanth K's comment
>
> >> static void gs_read_complete(struct usb_ep *ep, struct usb_request
> >> *req)  {
> >> -    struct gs_port *port = ep->driver_data;
> >> +    struct gs_port *port;
> >> +    unsigned long flags;
> >> +
> >> +    spin_lock_irqsave(&serial_port_lock, flags);
> >> +    port = ep->driver_data;
> >> +
> >> +    /* When port is NULL, return to avoid panic. */
>
> >This comment is not needed, it's obvious that you check before dereference.
> OK, I will delete this comment in the new patch.
>
> >BUT you can mention that you are trying to check with the race somewhere
> >else, right?  Please do that, and document here where that race is at
> >that you are doing this extra locking for.
> I don't fully understand what you mean. Are you asking which logic is in
> competition with this one, causing this port to be null?
>
> Considering that in some extreme cases, when the unbind operation
> being executed, gserial_disconnect has already cleared gser->ioport,
> and the controller has not stopped & pullup 0, sys.usb.config is reset

Few people here know what sys.usb.config does, so you should describe
properly what it is doing. What I imagine is that you unbind and then
bind to a new gadget by changing sys.usb.config. Is that right?

> and the bind operation will be re-executed, calling gs_read_complete,
> which will result in accessing gser->iport, resulting in a null pointer
> dereference, add a null pointer check to prevent this situation.

My only question is: why should unbind not wait for the pending urb to
be completed before getting into the race?

>
> >> +    if (!port) {
> >> +        spin_unlock_irqrestore(&serial_port_lock, flags);
> >> +        return;
> >> +    }
> >>
> >> -    /* Queue all received data until the tty layer is ready for it. */
> >>      spin_lock(&port->port_lock);
> >> +    spin_unlock(&serial_port_lock);
>
> >nested spinlocks, why?  Did you run this with lockdep enabled to verify
> >you aren't hitting a different bug now?
> Because there is a competition relationship between this function and the gserial_disconnect function,
> the gserial_disconnect function first obtains serial_port_lock and then obtains port->port_lock.
> The purpose of nesting is to ensure that when gs_read_complete is executed, it can be successfully executed after obtaining serial_port_lock.
> gserial_disconnect(..)
> {
>     struct gs_port *port = gser->ioport;
>     ...
>     spin_lock_irqsave(&serial_port_lock, flags);
>     spin_lock(&port->port_lock);
>     ...
>     gser->ioport = NULL;   ---> port = NULL;
>     ...
>     spin_unlock(&port->port_lock);
>     spin_unlock_irqrestore(&serial_port_lock, flags);
> }
>
> After enabling the lockdep function (CONFIG_DEBUG_LOCK_ALLOC=y), there is no lockdep-related warning information.
>
> >And why is one irqsave and one not?  That feels odd, it might be right,
> >but you need to document here why the difference.
> After the gs_read_complete function is executed, spin_unlock_irqrestore is used to restore the previous state,

胡连勤, this is not a common locking pattern, and that is the reason it
should be properly described.

> -    /* Queue all received data until the tty layer is ready for it. */
>      spin_lock(&port->port_lock);
> +    spin_unlock(&serial_port_lock);
> +
> +    /* Queue all received data until the tty layer is ready for it. */
>      list_add_tail(&req->list, &port->read_queue);
>      schedule_delayed_work(&port->push, 0);
> -    spin_unlock(&port->port_lock);
> +    spin_unlock_irqrestore(&port->port_lock, flags);   ---> Here we use spin_unlock_irqrestore to restore the state
>  }
>
> Thanks

Thank you
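
For anyone following the locking discussion, here is a minimal userspace
sketch of the lock hand-off being debated. It is not the driver code.
pthread mutexes stand in for the spinlocks, so the irqsave/irqrestore
question is not modelled at all, and the names struct port, struct endpoint,
global_lock, read_complete() and disconnect() are hypothetical. It also
assumes, as a simplification, that the disconnect path clears the same
pointer that the completion handler reads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel objects in the patch. */
struct port {
    pthread_mutex_t port_lock;  /* plays the role of port->port_lock */
    int queued;                 /* plays the role of port->read_queue */
};

struct endpoint {
    struct port *driver_data;   /* plays the role of ep->driver_data */
};

/* Plays the role of serial_port_lock. */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Completion path: look the port up under the global lock, take the
 * per-port lock, and only then drop the global lock (hand-off).
 * Returns 1 if the request was queued, 0 if the port was already gone.
 */
static int read_complete(struct endpoint *ep)
{
    struct port *port;

    assert(ep != NULL);
    pthread_mutex_lock(&global_lock);
    port = ep->driver_data;
    if (!port) {
        pthread_mutex_unlock(&global_lock);
        return 0;
    }
    pthread_mutex_lock(&port->port_lock);   /* same order as disconnect() */
    pthread_mutex_unlock(&global_lock);

    port->queued++;                         /* "queue the request" */
    pthread_mutex_unlock(&port->port_lock);
    return 1;
}

/* Disconnect path: global lock first, then port lock, then clear the pointer. */
static void disconnect(struct endpoint *ep)
{
    struct port *port;

    assert(ep != NULL);
    pthread_mutex_lock(&global_lock);
    port = ep->driver_data;
    if (port) {
        pthread_mutex_lock(&port->port_lock);
        ep->driver_data = NULL;
        pthread_mutex_unlock(&port->port_lock);
    }
    pthread_mutex_unlock(&global_lock);
}
```

Both paths take the global lock before the per-port lock, so the order is
consistent. A completion that runs after disconnect() sees NULL and returns
without touching the port.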