On Fri, 25 Jul 2014, Vinod Koul wrote: > On Fri, Jul 25, 2014 at 05:00:53PM +0200, Guennadi Liakhovetski wrote: > > Hi Vinod, > > > > On Fri, 25 Jul 2014, Vinod Koul wrote: > > > > Thanks for your review. However, I find the following a bit odd. As you > > remember, you already reviewed v2 of this driver: > > > > http://www.spinics.net/lists/dmaengine/msg00880.html > > > > and provided your comments to it, which I addressed in versions 3 and 4. > > Code, that you're commenting on here, hasn't (significantly) changed since > > v1. During your v2 review it didn't seem offending to you, now it does. > > Does this mean, that if / once I fix these issues, your next review might > > find some more _old_ code snippets, that you decide aren't appropriate? > > > > This isn't the first time such a thing is happening with various reviewers > > and patch submitters. I think a reasonable approach is to "trust your own > > review." Once I've reviewed a piece of code and found it ok, I _normally_ > > won't ask a patch author to modify code, that didn't change since my > > previous review. Simple. Of course, sometimes it does happen, that the > > first review skips some important flaws, but then I consider that my > > fault, if I didn't find them during the first round and try to find a > > solution to minimise the damage to the author. Now, to specific comments. > I agree you can blame me for not spotting them earlier and yes criticism is > fair. But then overall goal is to ensure that code is better, so even if > comment comes late we should accept it. Apparently, our approaches differ somewhat here. I explained already what I normally do in such cases. > > > On Sat, Jul 19, 2014 at 12:48:51PM +0200, Guennadi Liakhovetski wrote: > > > > This patch adds a driver for NBPF DMAC IP cores from Renesas, designed for > > > > the AMBA AXI bus. > > > > > > > > > > > +struct nbpf_desc { > > > > + struct dma_async_tx_descriptor async_tx; > > > > + bool user_wait; > > > sounds odd? > > > > Maybe it's not the best name, I can gladly try to improve it, but I'm sure > > I'm not the best "namer," so, the same can be said about more or less > > every identifier in all my code - it could be improved. However, I don't > > think it's critical enough to delay mainlining until the next kernel > > version? > No it is not critical at all! > > > > > +static int nbpf_chan_probe(struct nbpf_device *nbpf, int n) > > > > +{ > > > > + struct dma_device *dma_dev = &nbpf->dma_dev; > > > > + struct nbpf_channel *chan = nbpf->chan + n; > > > > + int ret; > > > > + > > > > + chan->nbpf = nbpf; > > > > + chan->base = nbpf->base + NBPF_REG_CHAN_OFFSET + NBPF_REG_CHAN_SIZE * n; > > > > + INIT_LIST_HEAD(&chan->desc_page); > > > > + spin_lock_init(&chan->lock); > > > > + chan->dma_chan.device = dma_dev; > > > > + dma_cookie_init(&chan->dma_chan); > > > > + nbpf_chan_prepare_default(chan); > > > > + > > > > + dev_dbg(dma_dev->dev, "%s(): channel %d: -> %p\n", __func__, n, chan->base); > > > > + > > > > + snprintf(chan->name, sizeof(chan->name), "nbpf %d", n); > > > > + > > > > + ret = devm_request_threaded_irq(dma_dev->dev, chan->irq, > > > > + nbpf_chan_irq, nbpf_chan_irqt, IRQF_SHARED, > > > > + chan->name, chan); > > > devm_request_irq and friends arent apt here. You are in .remove and get an > > > irq, what prevents that race? > > > > How is this your comment different for DMA drivers from any other drivers? > > Of course you have to make sure no more interrupts are arriving once in > > .remove() you lost the ability to handle IRQs. So, AFAIU, it's always a > > per-case study: you have to look at .remove() and consider, what happens > > if IRQ hits at any point inside it - if at all possible. Besides, DMA > > engine drivers are additionally protected by incremented module > > use-counts, once a channel is in use? As soon as a channel is released > > your driver has to make sure no more IRQs are occurring. Besides, there's > > a dmaengine_get()... So, I think, we really need to first find a race > > possibility, and then fix it. Besides, you could try > > > > grep -rl devm_request_irq drivers/dma/*.c > And please also see there were regression on few of them, recently dw was > fixed for this case. Similarly few other drivers were fixed for their > behaviour in tasklet kill, syncronizing the irq and freeing up behaviour. > > So now back to this driver, since you are using devm_ the irq is not freed, > so how does it prevent race. How is it not freed? Of course it is freed, but later - after .release() has completed. As for races - AFAIU, no interrupts should be coming after all DMA channels have been freed, and .release() will not be called as long as any channels are in use. However, I think, there might be a chance for a race if the user would try to free a DMA channel before all descriptors have completed and without calling TERMINATE_ALL. I think this would be an abnormal behaviour - usually you either wait for all descriptors to complete or terminate any remaining ones, but if such a buggy DMA slave driver would be used, and if .device_free_chan_resources() gets called (on a different CPU core) after a hard IRQ has completed and before the IRQ thread has run (this would be the same if a tasklet was used in the driver instead of an IRQ thread), i.e. CPU core #0 CPU core #1 DMA IRQ schedule IRQ thread .device_free_chan_resources() IRQ thread: nbpf_chan_irqt() then there can be a problem, because .device_free_chan_resources() would free descriptor pages and the IRQ thread would try to use them. Again, I consider that case to be a buggy DMA slave and even then such a race would be very unlikely, but theoretically it might be possible. The fix can be as simple as calling nbpf_chan_idle() from nbpf_free_chan_resources(). Would you like a new version with this single change or would such an incremental patch suffice? > If we do ensure that, then yes driver is fine > and can be merged, otherwise needs a fix > > > > > You need to run free_irq and synchronize_irq() > > > to ensure everything is proper before freeing up. > > > > > > Also why threaded_irq when dmaengine API mandates you to use tasklet? The > > > callback is supposed to be invoked from a tasklet. > > > > As for this one. Yes, I'm aware, that most other dmaengine drivers use a > > tasklet for bottom half IRQ processing. But is there a real requirement > > for that? This driver has been written in the end of last year and has > > been in use since then - no problems. As long as proper locking is used, > > an IRQ thread seems to work just fine here. And allows us to leverage the > > threaded-IRQ API instead of inventing own bottom-half processing. > > > > But if there's a _real_ essential reason to use a tasklet here, I can > > switch, np. Does it really have to be done now or would an incremental > > patch suffice? > The API mandates this for a reason. I think we can add this later as it > doesn't break the functionality. What was that reason? Is it that the network stack is using a tasklet and to make DMA usable for network TCP copy offloading? But that's only available with INTEL_IOATDMA and FSL_DMA and is broken anyway? config NET_DMA bool "Network: TCP receive copy offload" depends on DMA_ENGINE && NET default (INTEL_IOATDMA || FSL_DMA) depends on BROKEN Thanks Guennadi -- To unsubscribe from this list: send the line "unsubscribe dmaengine" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html