On Wed, Feb 5, 2020 at 5:42 PM Felipe Balbi <balbi@xxxxxxxxxx> wrote: > > > Hi, > > Andrey Konovalov <andreyknvl@xxxxxxxxxx> writes: > >> >> > +static int raw_event_queue_add(struct raw_event_queue *queue, > >> >> > + enum usb_raw_event_type type, size_t length, const void *data) > >> >> > +{ > >> >> > + unsigned long flags; > >> >> > + struct usb_raw_event *event; > >> >> > + > >> >> > + spin_lock_irqsave(&queue->lock, flags); > >> >> > + if (WARN_ON(queue->size >= RAW_EVENT_QUEUE_SIZE)) { > >> >> > + spin_unlock_irqrestore(&queue->lock, flags); > >> >> > + return -ENOMEM; > >> >> > + } > >> >> > + event = kmalloc(sizeof(*event) + length, GFP_ATOMIC); > >> >> > >> >> I would very much prefer dropping GFP_ATOMIC here. Must you have this > >> >> allocation under a spinlock? > >> > > >> > The issue here is not the spinlock, but that this might be called in > >> > interrupt context. The number of atomic allocations here is restricted > >> > by 128, and we can reduce the limit even further (until some point in > >> > the future when and if we'll report more different events). Another > >> > option would be to preallocate the required number of objects > >> > (although we don't know the required size in advance, so we'll waste > >> > some memory) and use those. What would you prefer? > >> > >> I think you shouldn't do either :-) Here's what I think you should do: > >> > >> 1. support O_NONBLOCK. This just means conditionally removing your > >> wait_for_completion_interruptible(). > > > > I don't think non blocking read/writes will work for us. We do > > coverage-guided fuzzing and need to collect coverage for each syscall. > > In the USB case "syscall" means processing a USB request from start to > > end, and thus we need to wait until the kernel finishes processing it, > > otherwise we'll fail to associate coverage properly (It's actually a > > bit more complex, as we collect coverage for the whole initial > > enumeration process as for one "syscall", but the general idea stands, > > that we need to wait until the operation finishes.) > > Fair enough, but if the only use case that this covers, is a testing > scenario, we don't gain much from accepting this upstream, right? We gain a lot, even though it's just for testing. For one thing, once the patch is upstream, all syzbot instances that target upstream-ish branches will start fuzzing USB, and there won't be any need for me to maintain a dedicated USB fuzzing branch manually. Another thing, is that syzbot will be able to do fix/cause bisection (at least for the bugs that are fixed/introduced after this patch is merged). And finally, once this is upstream, we'll be able to backport this to Android kernels and start testing them as well. > We can > still support both block and nonblock, but let's at least give the > option. > > >> 2. Every time user calls write(), you usb_ep_alloc(), allocate a buffer > >> with the write size, copy buffer to kernel space, > >> usb_ep_queue(). When complete() callback is called, then you free the > >> request. This would allow us to amortize the cost of copy_from_user() > >> with several requests being queued to USB controller. > > > > I'm not sure I really get this part. We'll still need to call > > copy_from_user() and usb_ep_queue() once per each operation/request. > > How does it get amortized? Or do you mean that having multiple > > requests queued will allow USB controller to process them in bulk? > > yes :-) > > > This makes sense, but again, we"ll then have an issue with coverage > > association. > > You can still enqueue one by one, but this would turn your raw-gadget > interface more interesting for other use cases. > > >> 3. Have a pre-allocated list of requests (128?) for read(). Enqueue them > >> all during set_alt(). When user calls read() you will: > >> > >> a) check if there are completed requests to be copied over to > >> userspace. Recycle the request. > >> > >> b) if there are no completed requests, then it depends on O_NONBLOCK > >> > >> i) If O_NONBLOCK, return -EWOULDBLOCK > >> ii) otherwise, wait_for_completion > > > > See response to #1, if we prequeue requests, then the kernel will > > start handling them before we do read(), and we'll fail to associate > > coverage properly. (This will also require adding another ioctl to > > imitate set_alt(), like the USB_RAW_IOCTL_CONFIGURE that we have.) > > set_alt() needs to be supported if we're aiming at providing support for > various USB classes to be implemented on top of what you created :-) What do you mean by supporting set_alt() here? AFAIU set_alt() is a part of the composite gadget framework, which I don't use for this. Are there some other actions (besides sending/receiving requests) that need to be exposed to userspace to implement various USB classes? The one that I know about is halting endpoints, it's mentioned in the TODO section in documentation. > > >> I think this can all be done without any GFP_ATOMIC allocations. > > > > Overall, supporting O_NONBLOCK might be a useful feature for people > > who are doing something else other than fuzzing, We can account for > > potential future extensions that'll support it, so detecting > > O_NONBLOCK and returning an error for now makes sense. > > > > WDYT? > > If that's the way you want to go, that's okay. But let's, then, prepare > the code for extension later on. For example, let's add an IOCTL which > returns the "version" of the ABI. Based on that, userspace can detect > features and so on. This sounds good to me. Let's concentrate on implementing the part that is essential for testing/fuzzing, as it was the initial reason why I started working on this, instead of using e.g. GadgetFS. I'll add such IOCTL in v6. Re GFP_ATOMIC allocations, if we're using the blocking approach, should I decrease the limit of the number of such allocations or do something else? Re licensing comments, do I need to change anything after all?