Re: [PATCH v2] spi: bcm2835: Enable shared interrupt support

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 6/5/2020 7:41 AM, Robin Murphy wrote:
> On 2020-06-05 14:46, Robin Murphy wrote:
>> On 2020-06-05 14:20, Mark Brown wrote:
>>> On Fri, Jun 05, 2020 at 12:34:36PM +0100, Robin Murphy wrote:
>>>> On 2020-06-04 22:28, Florian Fainelli wrote:
>>>
>>>>> For the BCM2835 case which is deemed performance critical, we would
>>>>> like
>>>>> to continue using an interrupt handler which does not have the extra
>>>>> comparison on BCM2835_SPI_CS_INTR.
>>>
>>>> FWIW, if I'm reading the patch correctly, then with sensible codegen
>>>> that
>>>> "overhead" should amount to a bit test on a live register plus a
>>>> not-taken
>>>> conditional branch - according to the 1176 TRM that should add up to a
>>>> whopping 2 cycles. If that's really significant then I'd have to wonder
>>>> whether you want to be at the mercy of the whole generic IRQ stack
>>>> at all,
>>>> and should perhaps consider using FIQ instead.
>>>
>>> Yes, and indeed the compiler does seem to manage that.  It *is* non-zero
>>> overhead though.
>>
>> True, but so's the existing level of pointer-chasing indirection that
>> with some straightforward refactoring could be taken right out of the
>> critical path and confined to just the conditional complete() call.
>> That's the kind of thing leaving me unconvinced that this is code
>> where every single cycle counts ;)
> 
> Ha, and in fact having checked a build out of curiosity, this patch
> as-is actually stands to make things considerably worse. At least with
> GCC 8.3 and bcm2835_defconfig, bcm2835_spi_interrupt_common() doesn't
> get inlined, which means bcm2835_spi_interrupt() pushes/pops a stack
> frame and makes an out-of-line call to bcm2835_spi_interrupt_common(),
> resulting in massively *more* work than the extra two instructions of
> simply inlining the test.
> 
> So yes, the overhead of inlining the test vs. the alternative is indeed
> non-zero. It's just also negative :D

Is it reliable across compiler versions if we use __always_inline?

The only other alternative that I can think of is using a static key to
eliminate the test for the single controller case. This feels highly
over engineered, but if that proves more reliable and gets everybody
their cookie, why not.

Lukas, do you have any way to test with the conditional being present
that the performance or latency does not suffer so much that it becomes
unacceptable for your use cases?
-- 
Florian



[Index of Archives]     [Device Tree Compilter]     [Device Tree Spec]     [Linux Driver Backports]     [Video for Linux]     [Linux USB Devel]     [Linux PCI Devel]     [Linux Audio Users]     [Linux Kernel]     [Linux SCSI]     [XFree86]     [Yosemite Backpacking]


  Powered by Linux