On Thu, Aug 07, 2014 at 04:37:16PM -0300, Emilio López wrote: > Hi, > > El 05/08/14 a las 17:00, Maxime Ripard escibió: > >Hi, > > > >Overall it looks very nice, thanks. > > > >On Mon, Aug 04, 2014 at 05:09:55PM -0300, Emilio López wrote: > >>+static struct sun4i_dma_pchan *find_and_use_pchan(struct sun4i_dma_dev *priv, > >>+ struct sun4i_dma_vchan *vchan) > >>+{ > >>+ struct sun4i_dma_pchan *pchan = NULL, *pchans = priv->pchans; > >>+ unsigned long flags; > >>+ int i, max; > >>+ > >>+ /* > >>+ * pchans 0-NDMA_NR_MAX_CHANNELS are normal, and > >>+ * NDMA_NR_MAX_CHANNELS+ are dedicated ones > >>+ */ > >>+ if (vchan->is_dedicated) { > >>+ i = NDMA_NR_MAX_CHANNELS; > >>+ max = DMA_NR_MAX_CHANNELS; > >>+ } else { > >>+ i = 0; > >>+ max = NDMA_NR_MAX_CHANNELS; > >>+ } > >>+ > >>+ spin_lock_irqsave(&priv->lock, flags); > >>+ for_each_clear_bit_from(i, &priv->pchans_used, max) { > >>+ pchan = &pchans[i]; > >>+ pchan->vchan = vchan; > >>+ set_bit(i, priv->pchans_used); > >>+ break; > > > >ffz instead? > > I think it'd look way more messy with ffz, as it only works on > unsigned longs and it's undefined on the all-1s case. > > We could have something like this, but I don't see much point in > what's basically expanding the macro > > spin_lock_irqsave(&priv->lock, flags); > i = find_next_zero_bit(&priv->pchans_used, max, i); > if (i < max) { > pchan = &pchans[i]; > pchan->vchan = vchan; > set_bit(i, priv->pchans_used); > } > spin_unlock_irqrestore(&priv->lock, flags); What about spin_lock_irqsave(&priv->lock, flags); i = ffz(&priv->pchans_used); if (!i) return NULL; pchan = &pchans[i]; pchan->vchan = vchan; set_bit(i, priv->pchans_used); spin_unlock_irqrestore(&priv->lock, flags); It looks quite clean to me. > >>+static irqreturn_t sun4i_dma_submit_work(int irq, void *dev_id) > >>+{ > >>+ struct sun4i_dma_dev *priv = dev_id; > >>+ struct sun4i_dma_vchan *vchan; > >>+ unsigned long flags; > >>+ int i; > >>+ > >>+ for (i = 0; i < DMA_NR_MAX_VCHANS; i++) { > >>+ vchan = &priv->vchans[i]; > >>+ spin_lock_irqsave(&vchan->vc.lock, flags); > >>+ execute_vchan_pending(priv, vchan); > >>+ spin_unlock_irqrestore(&vchan->vc.lock, flags); > >>+ } > >>+ > >>+ return IRQ_HANDLED; > >>+} > > > >Judging from Russell's comment here, that should be in the interrupt > >handler > > > >http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/277737.html > > Wouldn't that make the interrupt handler quite big time wise? I was > under the impression they should remain as short as possible. LDD3 > says "The programmer should be careful to write a routine that > executes in a minimum amount of time (...)" on chapter 10 Yeah, but if that involves delaying a lot a task that should have been performed asap, it would make sense. Plus, besides the loop in the contract's demands, execute_vchan_pending is actually rather small. Did you timed it to have such concerns? > I did try something mid-way a bit ago; I added code to see if we > could issue the next DMA operation from the current set, but it was > just extra complexity for a noop in performance, as they're sets of > one on all cases I tried. > > It's worth noting that the very latency sensitive DMA users we will > have (ie, audio) are already completely managed in the IRQ handler > thanks to the less flexible concept of cyclic transfers. It's not just a matter of latency, but also of throughput. If you wait for the tasklet softirq to be executed to schedule a new vchan, you aren't going to do anything during that time, while you could have. Maxime -- Maxime Ripard, Free Electrons Embedded Linux, Kernel and Android engineering http://free-electrons.com
Attachment:
signature.asc
Description: Digital signature