On Tue, Jan 17, 2023 at 05:18:39AM -0800, Maciej Żenczykowski wrote: > In Google internal bug 265639009 we've received an (as yet) unreproducible > crash report from an aarch64 GKI 5.10.149-android13 running device. > > AFAICT the source code is at: > https://android.googlesource.com/kernel/common/+/refs/tags/ASB-2022-12-05_13-5.10 > > The call stack is: > ncm_close() -> ncm_notify() -> ncm_do_notify() > with the crash at: > ncm_do_notify+0x98/0x270 > Code: 79000d0b b9000a6c f940012a f9400269 (b9405d4b) > > Which I believe disassembles to (I don't know ARM assembly, but it looks sane enough to me...): > > // halfword (16-bit) store presumably to event->wLength (at offset 6 of struct usb_cdc_notification) > 0B 0D 00 79 strh w11, [x8, #6] > > // word (32-bit) store presumably to req->Length (at offset 8 of struct usb_request) > 6C 0A 00 B9 str w12, [x19, #8] > > // x10 (NULL) was read here from offset 0 of valid pointer x9 > // IMHO we're reading 'cdev->gadget' and getting NULL > // gadget is indeed at offset 0 of struct usb_composite_dev > 2A 01 40 F9 ldr x10, [x9] > > // loading req->buf pointer, which is at offset 0 of struct usb_request > 69 02 40 F9 ldr x9, [x19] > > // x10 is null, crash, appears to be attempt to read cdev->gadget->max_speed > 4B 5D 40 B9 ldr w11, [x10, #0x5c] > > which seems to line up with ncm_do_notify() case NCM_NOTIFY_SPEED code fragment: > > event->wLength = cpu_to_le16(8); > req->length = NCM_STATUS_BYTECOUNT; > > /* SPEED_CHANGE data is up/down speeds in bits/sec */ > data = req->buf + sizeof *event; > data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget)); > > My analysis of registers and NULL ptr deref crash offset > (Unable to handle kernel NULL pointer dereference at virtual address 000000000000005c) > heavily suggests that the crash is due to 'cdev->gadget' being NULL when executing: > data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget)); > which calls: > ncm_bitrate(NULL) > which then calls: > gadget_is_superspeed(NULL) > which reads > ((struct usb_gadget *)NULL)->max_speed > and hits a panic. > > AFAICT, if I'm counting right, the offset of max_speed is indeed 0x5C. > (remember there's a GKI KABI reservation of 16 bytes in struct work_struct) > > It's not at all clear to me how this is all supposed to work... > but returning 0 seems much better than panic-ing... > > Cc: Felipe Balbi <balbi@xxxxxxxxxx> > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> > Cc: Lorenzo Colitti <lorenzo@xxxxxxxxxx> > Cc: Carlos Llamas <cmllamas@xxxxxxxxxx> > Cc: stable@xxxxxxxxxxxxxxx > Signed-off-by: Maciej Żenczykowski <maze@xxxxxxxxxx> > --- > drivers/usb/gadget/function/f_ncm.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c > index c36bcfa0e9b4..424bb3b666db 100644 > --- a/drivers/usb/gadget/function/f_ncm.c > +++ b/drivers/usb/gadget/function/f_ncm.c > @@ -83,7 +83,9 @@ static inline struct f_ncm *func_to_ncm(struct usb_function *f) > /* peak (theoretical) bulk transfer rate in bits-per-second */ > static inline unsigned ncm_bitrate(struct usb_gadget *g) > { > - if (gadget_is_superspeed(g) && g->speed >= USB_SPEED_SUPER_PLUS) > + if (!g) > + return 0; > + else if (gadget_is_superspeed(g) && g->speed >= USB_SPEED_SUPER_PLUS) > return 4250000000U; > else if (gadget_is_superspeed(g) && g->speed == USB_SPEED_SUPER) > return 3750000000U; This looks like the wrong place to fix things - if this case is hit, don't we go on to call usb_eq_queue() which can't be valid if the gadget has been destroyed? I don't see how cdev->gadget can be set to null without cdev being freed so is this actually a use-after-free not a simple null-dereference? My guess is that somehow the gadget is being destroyed while the network interface is held open (we've seen similar issues in other, non-network, gadget functions), but I don't know enough about the network side of things to know how to cause that from userspace. John