Re: at91-sama5d2_adc crash

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 15, 2020 at 7:30 PM <Eugen.Hristev@xxxxxxxxxxxxx> wrote:
>
> On 14.09.2020 15:37, Alexandru Ardelean wrote:
>
> > On Mon, Sep 14, 2020 at 2:33 PM <Eugen.Hristev@xxxxxxxxxxxxx> wrote:
> >>
> >> Hello Alex,
> >>
> >> Sorry to disturb but we have issues again with this patch :
> >>
> >>
> >
> > firstly, many apologies for the breakage;
> >
> >>
> >> f3c034f61775 ("iio: at91-sama5d2_adc: adjust
> >> iio_triggered_buffer_{predisable,postenable} positions")
> >>
> >> I recently discovered a crash when using buffered trigger with DMA with
> >> this driver:
> >>
> >> # echo 100 > /sys/bus/iio/devices/iio\:device0/buffer/length
> >> # echo 100 > /sys/bus/iio/devices/iio\:device0/buffer/watermark
> >> # echo 1 > /sys/bus/iio/devices/iio\:device0/scan_elements/in_voltage4_en
> >> # iio_generic_buffer -n fc030000.adc -t
> >> fc030000.adc-dev0-external_rising -c 5
> >> iio device number being used is 0
> >> iio trigger number being used is 0
> >> /sys/bus/iio/devicii/iio:device0 fc030000.adc-dev0-external_risingo
> >> iio:device0: using dma0chan10 for rx DMA transfers
> >>
> >> Division by zero in kernel.
> >> CPU: 0 PID: 243 Comm: irq/182-fc03000 Not tainted 5.8.0-rc1 #1
> >> Hardware name: Atmel SAMA5
> >> [<c010caf0>] (unwind_backtrace) from [<c010a034>] (show_stack+0x10/0x14)
> >> [<c010a034>] (show_stack) from [<c03a892c>] (Ldiv0+0x8/0x10)
> >> [<c03a892c>] (Ldiv0) from [<c03a88fc>] (__aeabi_uidivmod+0x8/0x18)
> >> [<c03a88fc>] (__aeabi_uidivmod) from [<c03592d4>] (div_s64_rem+0x3c/0xc4)
> >> [<c03592d4>] (div_s64_rem) from [<c05ed344>]
> >
> > which of these lines is the issue?
> >
> >          sample_count = div_s64(transferred_len, sample_size);
> >
> >          /*
> >           * interval between samples is total time since last transfer handling
> >           * divided by the number of samples (total size divided by sample size)
> >           */
> >          interval = div_s64((ns - st->dma_st.dma_ts), sample_count);
> >
> > would a simple return like below make sense?
> > if (!sample_size)    // or  if (!sample_count)
> >      return;
> >
> > Is at91_adc_dma_size_done() returning 0?
> > Or more closely, which variable is zero?
> >
> >> (at91_adc_trigger_handler+0xcc/0x494)
> >> [<c05ed344>] (at91_adc_trigger_handler) from [<c014944c>]
> >> (irq_thread_fn+0x1c/0x78)
> >> [<c014944c>] (irq_thread_fn) from [<c01496dc>] (irq_thread+0x124/0x1d0)
> >> [<c01496dc>] (irq_thread) from [<c01325b4>] (kthread+0x138/0x140)
> >> [<c01325b4>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
> >> Exception stack(0xde49dfb0 to 0xde49dff8)
> >> dfa0:                                     00000000 00000000 00000000
> >> 00000000
> >> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >> 00000000
> >> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >> Division by zero in kernel.
> >> CPU: 0 PID: 243 Comm: irq/182-fc03000 Not tainted 5.8.0-rc1 #1
> >> Hardware name: Atmel SAMA5
> >> [<c010caf0>] (unwind_backtrace) from [<c010a034>] (show_stack+0x10/0x14)
> >> [<c010a034>] (show_stack) from [<c03a7e24>] (Ldiv0_64+0x8/0x18)
> >> [<c03a7e24>] (Ldiv0_64) from [<c0359344>] (div_s64_rem+0xac/0xc4)
> >> [<c0359344>] (div_s64_rem) from [<c05ed360>]
> >> (at91_adc_trigger_handler+0xe8/0x494)
> >> [<c05ed360>] (at91_adc_trigger_handler) from [<c014944c>]
> >> (irq_thread_fn+0x1c/0x78)
> >> [<c014944c>] (irq_thread_fn) from [<c01496dc>] (irq_thread+0x124/0x1d0)
> >> [<c01496dc>] (irq_thread) from [<c01325b4>] (kthread+0x138/0x140)
> >> [<c01325b4>] (kthread) from [<c0100148>] (ret_from_fork+0x14/0x2c)
> >> Exception stack(0xde49dfb0 to 0xde49dff8)
> >> dfa0:                                     00000000 00000000 00000000
> >> 00000000
> >> dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
> >> 00000000
> >> dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
> >> sched: RT throttling activated
> >>
> >>
> >> It looks like crash is there since 5.8-rc1 introduced by that patch
> >>
> >> I looked in the code and it looks something is zero, probably the
> >> received buffer size. It is likely that the DMA starts too soon before
> >> the buffer is properly setup ?
> >
> > I am not sure about this. See [1]
> >
> >>
> >> Can you help with fixing this ? or you know how we can do it ?
> >> Also could you remind me why we enable and start the DMA on pre-enable
> >> of the buffer instead of post-enable of the buffer ?
> >
> > [1]
> > So, in iio_enable_buffers(), the order is:
> > preenable()
> > update_scan_mode()
> > hwfifo_set_watermark()
> > iio_buffer_enable()    <-  this can also enable the DMA buffer if
> > using the IIO DMA buffer framework
> > [2] IIO-core-now-attaches-pollfunc via iio_trigger_attach_poll_func()
> > (if current mode is INDIO_BUFFER_TRIGGERED)
> > postenable()
> >
> > I could guess maybe that hwfifo_set_watermark() is called too late?
> > Ideally, the driver would use the IIO DMA buffer hook to enable the
> > buffer via iio_buffer_enable()
> > That would be the ideal way to do it, and not enable DMA in preenable.
>
> Hi,
>
> you are correct. The set watermark is being set too late. And the
> dma_init is called when a watermark higher than 1 is being set. In
> consequence, DMA is not initialized at the moment of dma_start (which is
> now in preenable, which is before the set_watermark).
>
> Do you have any suggestion about how to fix this ? Move the start dma to
> set_watermark maybe ?

I think as a quick fix, calling the preenable() in set_watermark()
could be an idea.
If this solves the issue, then I would do it now.
Long-term it may make sense to re-organize the hooks a bit, and
probably use the IIO DMA buffer framework.
I don't know if the current form for IIO DMA buffer is sufficient for
this driver, but extensions are always an idea.

>
> Thanks,
> Eugen
> >
> > Regarding [2] it was discussed that attach/detach should be done in
> > that order, and it should be done in IIO core.
> > So, all drivers were cleaned up to respect that order.
> > The idea is more in the lines that repetitive/duplicated things should
> > be done by the IIO framework, and attach/detach-of-pollfunc is one of
> > them.
> >
> >> In pre-enable, do we have everything ready inside IIO to be able to
> >> start the DMA? Or it's better to have it at post-enable time ?
> >
> > This really depends on each driver.
> >
> >>
> >> I know you want to ditch the post-enable and pre-disable hooks, but it
> >> looks this driver needs them, or we need to find a way to make it work
> >> properly
> >
> > These hooks have already been dropped in favor of the IIO core doing
> > attach/detach.
> > At this point, it may be a bit more difficult to go back, than to move forward.
> >
> > I think this crash should be fixable to function with this IIO core change.
> >
> > Again, apologies for the breakage.
> >
> > Thanks
> > Alex
> >
> >>
> >> Thanks !
> >> Eugen
>



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Input]     [Linux Kernel]     [Linux SCSI]     [X.org]

  Powered by Linux