Hi Balbi,

On 08/03/16 14:01, Felipe Balbi wrote:
>
> Hi,
>
> Felipe Ferreri Tonello <eu@xxxxxxxxxxxxxxxxx> writes:
>>>>>>>> Since f_midi_transmit is called by both ALSA and USB frameworks, it
>>>>>>>> can potentially cause a race condition between both calls. This is
>>>>>>>> bad because the way f_midi_transmit is implemented can't handle
>>>>>>>> concurrent calls. This is due to the fact that the usb request fifo
>>>>>>>> looks for the next element and only if it has data to process it
>>>>>>>> enqueues the request, otherwise re-uses it. If both (ALSA and USB)
>>>>>>>> frameworks call this function at the same time, the kfifo_seek()
>>>>>>>> will return the same usb_request, which will cause a race
>>>>>>>> condition.
>>>>>>>>
>>>>>>>> To solve this problem a synchronization mechanism is necessary. In
>>>>>>>> this case a spinlock is used since f_midi_transmit is also called
>>>>>>>> by the usb_request->complete callback in interrupt context.
>>>>>>>>
>>>>>>>> In my benchmarks, spinlocks were more efficient than scheduling
>>>>>>>> the f_midi_transmit tasklet in process context and using a mutex
>>>>>>>> to synchronize. Also it performs better than the previous
>>>>>>>> implementation that allocated a usb_request for every new transmit
>>>>>>>> made.
>>>>>>>
>>>>>>> behaves better in what way ? Also, previous implementation would not
>>>>>>> suffer from this concurrency problem, right ?
>>>>>>
>>>>>> The spin lock is faster than allocating usb requests all the time,
>>>>>> even if the udc uses da for it.
>>>>>
>>>>> did you measure ? Is the extra speed really necessary ? How did you
>>>>> benchmark this ?
>>>>
>>>> Yes I did measure and it was not that significant. This is not about
>>>> speed. There was a bug in that approach that I already explained on
>>>
>>> you have very confusing statements. When I mentioned that previous code
>>> wouldn't have the need for the spinlock you replied that spinlock was
>>> faster.
>>>
>>> When I asked you about benchmarks you reply saying it's not about the
>>> speed.
>>>
>>> Make up your mind dude. What are you trying to achieve ?
>>>
>>>> that patch, which was approved and applied BTW.
>>>
>>> patches can be reverted if we realise we're better off without
>>> them. Don't get cocky, please.
>>
>> Yes, I am aware of that, but I honestly think that is the wrong way of
>> dealing with this.
>>
>> ?? I don't get why I am giving this impression.
>
> re-read your emails. The gist goes like this:
>
> . Send patch
> . Got comments
> . Well, whatever, you can just ignore if you don't agree

This is one of the problems with email. It can give the wrong impression
and feelings. :)

That was not what I meant at all. I mean that for real, not in a
childish manner. I'm sorry if I gave you that impression.

>
>>>> Any way, this spinlock should've been there since that patch but I
>>>> couldn't really trigger this problem without a stress test.
>>>
>>> which tells me you sent me patches without properly testing. How much
>>> time did it take to trigger this ? How did you trigger this situation ?
>>
>> No, that is not true. The implementation I sent is working properly for
>> any real world usage.
>>
>> The stress test I made to break the current implementation is *not* a
>> real use-case. I made it in order to push as far as possible how fast
>> the driver can *reliably* handle while sending and reading data. Then I
>> noticed the bug.
>>
>> So, to answer your question. To trigger this bug is not a matter of
>> time. The following needs to happen:
>> 1. Device sends MIDI message that is *bigger* than the usb request
>> length. (just this by itself is really unlikely to happen in real world
>> usage)
>
> I wouldn't say it's unlikely. You just cannot trust the other side of
> the wire. We've seen e.g. Xbox 360's SCSI layer sending messages of the
> wrong size and we worked around them in g_mass_storage.
>
> Broken implementations are a real thing ;-)

Fair enough. And that's why I am pushing this fix. :)

>
>> 2. Host sends a MIDI message back *exactly* at the same time as the
>> device is processing the second part of the usb request from the same
>> message.
>
> also not that unlikely to happen ;-) You can't assume the host will only
> shift tokens on the wire at the time you're expecting it to.
>
>> I couldn't trigger this in all the tests we've made. I just triggered
>> when I was sending huge messages back and forth (device <-> host) as
>> mentioned.
>
> fair enough.
>
>> In fact, we have thousands of devices out there without this patch (but
>> with my previous patch that introduced this bug).
>
> that's thousands of devices waiting to have a problem, right ? :-)

:X

>
>> I am not trying to say it wasn't a mistake. That patch unfortunately
>> introduces this bug, but it has real improvements over the previous
>> implementation. AFAIR the improvements are:
>> * Fixes a bug that was causing the DMA buffer to fill up, causing a
>> kernel panic.
>
> this is a good point. Had forgotten about that detail. Thanks
>
>> * Pre allocate IN usb requests so there is no allocation overhead while
>> sending data (same behavior already existed for the OUT endpoint). This
>> ensures that the DMA memory is not misused affecting the rest of the
>> system.
>
> also, arguably, a good idea. Recycling requests is a lot nicer and it's
> what most gadget drivers do.
>
>> * It doesn't crash if the host doesn't send an ACK after IN data
>> packets and we have reached the limit of available memory. Also, this is
>> useful because it causes the ALSA layer to timeout, which is the correct
>> userspace behavior.
>
> right
>
>> * Continues to send data to the correct Jack (associated to each ALSA
>> substream) if that was interrupted somehow, for instance by the size
>> limit of a usb request.
>
> ok.
>
>>>> So, this patch fixes a bug in the current implementation.
>>>
>>> fixes a regression introduced by you, true. I'm trying to figure out if
>>> we're better off without the original patch; to make a good decision I
>>> need to know if the extra "speed" we get from not allocating requests on
>>> demand are really that important.
>>>
>>> So, how much faster did you get and is that extra "speed" really
>>> important ?
>>
>> The speed is not relevant at all in this case. It was not the goal of
>> the patch, but I mentioned it because it is obvious that with no memory
>> allocation the code will execute faster.
>>
>> I did measure the speed improvements at that time, it was real but not
>> relevant. I don't think we should be discussing this anyway.
>
> fair enough. This was probably the first email from you which gave me
> some peace of mind that you know what you're doing with this fix. Keep
> in mind that we all receive hundreds of emails a day and it's difficult
> to track things over time.

True. I will try to keep this always in mind.

>
> It's also a big PITA when someone sends fixes and cleanups on the same
> series and/or with dependencies between them. The correct way is to send
> *only* fixes first. They should be minimal patches that *only* fix the
> problem. If the code looks messy or doesn't follow the coding style,
> that's something you do on a completely separate fix and, usually, from
> a clean topic branch starting at a tag from Linus (exceptions may arise,
> of course).

Got it.

>
> So anyway, to finally finish this up. Can you send JUST the bare minimum
> fix necessary to avoid the regression ? Also, add a proper Fixes: foobar
> line on commit log (see commit e18b7975c885bc3a938b9a76daf32957ea0235fa
> for an example).
>
> Then we can get that merged. Keep in mind that you might have to Cc
> stable (see same commit listed above).

Ok. I will send the state-machine refactor as another patch in another
topic then.

>
> After this is sorted out, then let's see how we can help you move your
> product to libusbgx and check if there's anything missing in configfs
> to cope with your use-case.

That will be great, thanks! I will keep the list posted.

>
> ps: can you point me to your devices shipping with f_midi ? Which
> architecture are they using ? Which USB Peripheral Controller ? This
> might be a good addition to my test farm depending on your answers above
> :-p

Seaboard GRAND[1]. Freescale's i.MX 6 running an ARM A9. The controller
is Chip Idea.

[1] https://www.roli.com/products/seaboard-grand

-- 
Felipe
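
For readers following the thread, the race described in the quoted commit
message can be illustrated with a rough user-space sketch. This is not the
f_midi code: fake_request, fake_midi, fake_transmit and run_stress are
hypothetical names, the pool is a plain array rather than a kfifo, and a
pthread mutex stands in for the spinlock (in the driver, which also runs in
interrupt context, a spinlock would be needed instead). It only shows the
idea of making "peek at the next pre-allocated request, fill it, enqueue
it" one critical section when two callers share the pool.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE   4   /* hypothetical: number of pre-allocated requests */
#define REQ_LEN     8   /* hypothetical: bytes that fit in one request */
#define MAX_MSG     64  /* hypothetical: largest message a worker sends */
#define MAX_THREADS 8

struct fake_request {               /* stand-in for struct usb_request */
	unsigned char buf[REQ_LEN];
	size_t length;
};

struct fake_midi {
	struct fake_request pool[POOL_SIZE];
	size_t next;                /* request the next caller will peek at */
	unsigned long bytes_queued; /* data handed to the "controller" */
	pthread_mutex_t lock;       /* stand-in for the driver's spinlock */
};

/* Peek at the next request, fill it, "enqueue" it; all under one lock. */
static void fake_transmit(struct fake_midi *m,
			  const unsigned char *data, size_t len)
{
	pthread_mutex_lock(&m->lock);
	while (len > 0) {
		struct fake_request *req = &m->pool[m->next];
		size_t n = len < REQ_LEN ? len : REQ_LEN;

		/* With the lock held, nobody else can be using this request. */
		assert(req->length == 0);
		memcpy(req->buf, data, n);
		req->length = n;
		data += n;
		len -= n;

		/* "Enqueue": account for the data, then recycle the request. */
		m->bytes_queued += req->length;
		req->length = 0;
		m->next = (m->next + 1) % POOL_SIZE;
	}
	pthread_mutex_unlock(&m->lock);
}

struct worker_arg {
	struct fake_midi *m;
	int iterations;
	size_t msg_len;
};

static void *worker(void *p)
{
	static const unsigned char msg[MAX_MSG]; /* zero-filled payload */
	struct worker_arg *a = p;

	for (int i = 0; i < a->iterations; i++)
		fake_transmit(a->m, msg, a->msg_len);
	return NULL;
}

/* Run nthreads concurrent senders; return the total bytes "queued". */
unsigned long run_stress(int nthreads, int iterations, size_t msg_len)
{
	struct fake_midi m;
	struct worker_arg arg;
	pthread_t tid[MAX_THREADS];

	assert(nthreads >= 1 && nthreads <= MAX_THREADS);
	assert(msg_len <= MAX_MSG);

	memset(&m, 0, sizeof(m));
	pthread_mutex_init(&m.lock, NULL);
	arg.m = &m;
	arg.iterations = iterations;
	arg.msg_len = msg_len;

	for (int i = 0; i < nthreads; i++) {
		int rc = pthread_create(&tid[i], NULL, worker, &arg);
		assert(rc == 0);
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);

	pthread_mutex_destroy(&m.lock);
	return m.bytes_queued;
}
```

Calling run_stress(2, 10000, 20) models two callers sending messages larger
than one request, which is the condition listed above. Without the lock, two
callers could peek at the same request and overwrite each other's data,
which is the failure mode the commit message describes.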