Hi Balbi,

On 08/03/16 07:37, Felipe Balbi wrote:
>
> Hi,
>
> Felipe Ferreri Tonello <eu@xxxxxxxxxxxxxxxxx> writes:
>>>>>> Since f_midi_transmit is called by both ALSA and USB frameworks, it can
>>>>>> potentially cause a race condition between both calls. This is bad because the
>>>>>> way f_midi_transmit is implemented can't handle concurrent calls. This is due
>>>>>> to the fact that the usb request fifo looks for the next element and only if
>>>>>> it has data to process it enqueues the request, otherwise re-uses it. If both
>>>>>> (ALSA and USB) frameworks calls this function at the same time, the
>>>>>> kfifo_seek() will return the same usb_request, which will cause a race
>>>>>> condition.
>>>>>>
>>>>>> To solve this problem a syncronization mechanism is necessary. In this case it
>>>>>> is used a spinlock since f_midi_transmit is also called by usb_request->complete
>>>>>> callback in interrupt context.
>>>>>>
>>>>>> On benchmarks realized by me, spinlocks were more efficient then scheduling
>>>>>> the f_midi_transmit tasklet in process context and using a mutex
>>>>>> to synchronize. Also it performs better then previous
>>>>>> implementation that
>>>>>> allocated a usb_request for every new transmit made.
>>>>>
>>>>> behaves better in what way ? Also, previous implementation would not
>>>>> suffer from this concurrency problem, right ?
>>>>
>>>> The spin lock is faster than allocating usb requests all the time,
>>>> even if the udc uses da for it.
>>>
>>> did you measure ? Is the extra speed really necessary ? How did you
>>> benchmark this ?
>>
>> Yes I did measure and it was not that significant. This is not about
>> speed. There was a bug in that approach that I already explained on
>
> you have very confusing statements. When I mentioned that previous code
> wouldn't have the need for the spinlock you replied that spinlock was
> faster.
>
> When I asked you about benchmarks you reply saying it's not about the
> speed.
>
> Make up your mind dude. What are you trying to achieve ?
>
>> that patch, which was approved and applied BTW.
>
> patches can be reverted if we realise we're better off without
> them. Don't get cocky, please.

Yes, I am aware of that, but I honestly think that is the wrong way of
dealing with this.

?? I don't get why I am giving this impression.

>
>> Any way, this spinlock should've been there since that patch but I
>> couldn't really trigger this problem without a stress test.
>
> which tells me you sent me patches without properly testing. How much
> time did it take to trigger this ? How did you trigger this situation ?

No, that is not true. The implementation I sent works properly for any
real-world usage. The stress test I made to break the current
implementation is *not* a real use case. I made it to find out how fast
the driver can *reliably* send and read data when pushed as far as
possible. Then I noticed the bug.

So, to answer your question: triggering this bug is not a matter of
time. The following needs to happen:

1. The device sends a MIDI message that is *bigger* than the usb request
length (this by itself is really unlikely to happen in real-world
usage).

2. The host sends a MIDI message back at *exactly* the same time as the
device is processing the second part of the usb request from the same
message.

I couldn't trigger this in any of the tests we've made. I only triggered
it when I was sending huge messages back and forth (device <-> host) as
mentioned.

In fact, we have thousands of devices out there without this patch (but
with my previous patch that introduced this bug).

I am not trying to say it wasn't a mistake. That patch unfortunately
introduced this bug, but it has real improvements over the previous
implementation. AFAIR the improvements are:

* Fixes a bug that was causing the DMA buffer to fill up, causing a
kernel panic.
* Pre-allocates IN usb requests so there is no allocation overhead while
sending data (the same behavior already existed for the OUT endpoint).
This ensures that the DMA memory is not misused in a way that affects
the rest of the system.
* It doesn't crash if the host doesn't send an ACK after IN data packets
and we have reached the limit of available memory. Also, this is useful
because it causes the ALSA layer to time out, which is the correct
userspace behavior.
* Continues to send data to the correct Jack (associated with each ALSA
substream) if that was interrupted somehow, for instance by the size
limit of a usb request.

>
>> So, this patch fixes a bug in the current implementation.
>
> fixes a regression introduced by you, true. I'm trying to figure out if
> we're better off without the original patch; to make a good decision I
> need to know if the extra "speed" we get from not allocating requests on
> demand are really that important.
>
> So, how much faster did you get and is that extra "speed" really
> important ?

The speed is not relevant at all in this case. It was not the goal of
the patch, but I mentioned it because it is obvious that the code
executes faster when there is no memory allocation.

I did measure the speed improvements at that time; they were real but
not relevant.

I don't think we should be discussing this anyway.

--
Felipe