Re: [PATCH] media: Support Intersil/Techwell TW686x-based video capture cards

Ezequiel Garcia <ezequiel@xxxxxxxxxxxxxxxxxxxx> · Mon, 18 Jan 2016 12:31:17 -0300

On 18 January 2016 at 12:28, Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
> On 01/18/2016 04:20 PM, Ezequiel Garcia wrote:
>> Hi Hans,
>>
>> On 18 January 2016 at 10:02, Hans Verkuil <hverkuil@xxxxxxxxx> wrote:
>>> Hi Ezequiel,
>>>
>>> Thanks for working on this! Do you know where I can get a board tw686x board?
>>> I always like to have hardware to test the driver, if at all possible.
>>>
>>
>> No, I don't know. I have one to spare, and I could send it to you.
>>
>>> See below for a review of this driver.
>
> <snip>
>
>>>> +     /*
>>>> +      * This allows to detect device is not here,
>>>> +      * and will be used by vb2_ops. The lock is really
>>>> +      * important here.
>>>> +      */
>>>> +     spin_lock_irqsave(&dev->lock, flags);
>>>> +     dev->pci_dev = NULL;
>>>> +     spin_unlock_irqrestore(&dev->lock, flags);
>>>
>>> As you sure this is needed? Normally you can only come here if the module
>>> is removed, which isn't possible as long as userspace is using it. And if
>>> the module is removed, then vb2 shouldn't be called at all.
>>>
>>> The only exception would be if this is a hot-pluggable device, which is
>>> quite unlikely for a PCI device. I don't believe any of the pci drivers
>>> support that.
>>>
>>
>> A previous version of the driver didn't have that. However, under certain
>> stress testing it was observed that the PCIe link goes down. I still have the
>> traces for that:
>>
>> [..]
>> [21833.389031] pciehp 0000:13:01.0:pcie24: pcie_isr: intr_loc 100
>> [21833.389035] pciehp 0000:13:01.0:pcie24: Data Link Layer State change
>> [21833.389038] pciehp 0000:13:01.0:pcie24: slot(1-5): Link Down event
>> [21833.389076] pciehp 0000:13:01.0:pcie24: Disabling
>> domain:bus:device=0000:14:00
>> [21833.389078] pciehp 0000:13:01.0:pcie24: pciehp_unconfigure_device:
>> domain:bus:dev = 0000:14:00
>> [21833.389103] TW686x 0000:14:00.0: removing
>> [21833.416557] TW686x 0000:14:00.0: removed
>> [..]
>>
>> I have no idea why the link goes down (hardware issue?),
>> but it's better to handle it gracefully :)
>
> This definitely needs to be documented.
>
> <snip>
>
>>>> +             /* handle video stream */
>>>> +             spin_lock_irqsave(&vc->qlock, flags);
>>>> +             if (vc->curr_bufs[pb]) {
>>>> +                     vb = &vc->curr_bufs[pb]->vb;
>>>> +                     tw686x_buffer_copy(vc, pb, vb);
>>>
>>> You have to copy the data? It's not possible the program the DMA so that
>>> it DMAs into the buffer itself? That's quite unusual for a PCI device.
>>>
>>
>> Yes, it's possible and I spent an enormous amount of time trying to make it work
>> (originally using scatter-gather mode, and then with frame mode).
>>
>> However, despite my many efforts it always stucked (sooner or later in
>> the tests)
>> into a hard machine freeze. There are two apparent sources for the freeze:
>>
>> (1) To make the above work you need to program the registers so the chip DMAs
>> into a new buffer each time the current DMA buffer is completed.
>>
>> (2) Also, when a signal error is detected and/or signal is lost and recovered,
>> the DMA channels are re-programmed as well.
>>
>> It was only when all the registers write got removed and minimized to the bare
>> minimum (registers are written before streaming starts and then stay mostly
>> untouched) that I got a stable driver working fine for several weeks.
>>
>> The ugly delay timer is meant to mitigate (2). And the buffer copy is
>> to workaround (1).
>>
>> Chip and board vendors couldn't provide any explanation for this
>> behavior. I have
>> two different boards (one with 1-chip, one with 2-chips and a PCIe switch),
>> and the issues are present on both.
>>
>> In any case, the vendor's Windows driver does the similar buffer-copy.
>>
>> I understand that on some platforms this implementation could be too
>> costly (it's
>> completely cheap on any modern x86), and I intend to provide some option
>> to provide "frame DMA-to-buffer" and "scatter-gather DMA".
>>
>> However, I wanted to get this basic version merged first.
>>
>> (Sorry, I should have included all this in the cover letter since
>> it was pretty obvious you would ask :)
>
> This too definitely needs to be documented in the code.
>
> For both issues it is not enough to document that in the cover letter,
> since future maintainers of the code will not see that. It really has to
> be in the code itself.
>
> These are typical workarounds for weird hardware behavior that isn't documented
> anywhere and that future developers are inclined to remove if it isn't clearly
> stated why they are needed.
>

No problem.

Thanks,
-- 
Ezequiel García, VanguardiaSur
www.vanguardiasur.com.ar
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html