Re: [PATCH v2 4/5] misc: fastrpc: Add polling mode support for fastRPC driver

On 3/20/2025 7:45 PM, Dmitry Baryshkov wrote:
> On Thu, Mar 20, 2025 at 07:19:31PM +0530, Ekansh Gupta wrote:
>>
>> On 1/29/2025 4:10 PM, Dmitry Baryshkov wrote:
>>> On Wed, Jan 29, 2025 at 11:12:16AM +0530, Ekansh Gupta wrote:
>>>>
>>>> On 1/29/2025 4:59 AM, Dmitry Baryshkov wrote:
>>>>> On Mon, Jan 27, 2025 at 10:12:38AM +0530, Ekansh Gupta wrote:
>>>>>> For any remote call to the DSP, after sending an invocation message,
>>>>>> the fastRPC driver waits for a glink response, and during this time
>>>>>> the CPU can go into low-power modes. Add polling mode support, with
>>>>>> which the fastRPC driver polls continuously on a memory location
>>>>>> after sending a message to the remote subsystem. This eliminates
>>>>>> CPU wakeup and scheduling latencies and reduces fastRPC overhead.
>>>>>> With this change, the DSP always sends a glink response, which is
>>>>>> ignored if polling mode did not time out.
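
For reference, the in-driver polling wait described above is roughly the
sketch below. This is illustrative only: the poll-word field (ctx->poll),
the completion marker and the timeout budget are placeholders, not the
exact implementation.

#define FASTRPC_POLL_TIMEOUT_US	4000	/* assumed polling budget, in microseconds */
#define FASTRPC_POLL_RESPONSE	0xdecaf	/* assumed "DSP is done" marker */

static int fastrpc_poll_wait(struct fastrpc_invoke_ctx *ctx)
{
	u32 i;

	for (i = 0; i < FASTRPC_POLL_TIMEOUT_US; i++) {
		/* the DSP writes the marker into shared memory when it is done */
		if (READ_ONCE(*ctx->poll) == FASTRPC_POLL_RESPONSE)
			return 0;
		udelay(1);
	}

	/* timed out: fall back to waiting for the regular glink response */
	return -ETIMEDOUT;
}
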
>>>>> Is there a chance to implement an actual async I/O protocol with the
>>>>> help of the poll() call instead of hiding the polling / wait inside
>>>>> the invoke2?
>>>> This design is based on the current DSP firmware implementation:
>>>> Call flow: https://github.com/quic-ekangupt/fastrpc/blob/invokev2/Docs/invoke_v2.md#5-polling-mode
>>>>
>>>> Can you please give some reference to the async I/O protocol that you've
>>>> suggested? I can check if it can be implemented here.
>>> As with the typical poll() call implementation:
>>> - write some data using ioctl
>>> - call poll() / select() to wait for the data to be processed
>>> - read data using another ioctl
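
Something like this from the userspace side, I assume (a rough sketch;
the ioctl names and numbers below are placeholders, only the
submit/poll/read pattern is the point):

#include <poll.h>
#include <sys/ioctl.h>

/* placeholder request codes, for illustration only */
#define FASTRPC_IOCTL_SUBMIT	_IO('R', 100)
#define FASTRPC_IOCTL_RESULT	_IO('R', 101)

static int invoke_async(int fd, void *req, void *resp)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ret;

	/* 1. submit the invocation without blocking for completion */
	ret = ioctl(fd, FASTRPC_IOCTL_SUBMIT, req);
	if (ret < 0)
		return ret;

	/* 2. wait until the driver reports the result is ready */
	ret = poll(&pfd, 1, -1);
	if (ret < 0)
		return ret;

	/* 3. fetch the completed response with a second ioctl */
	return ioctl(fd, FASTRPC_IOCTL_RESULT, resp);
}
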
>>>
>>> Getting back to your patch: from your commit message it is not clear
>>> which SoCs support this feature. Reminding you that we support all
>>> kinds of platforms, including the ones that have been EOLed by
>>> Qualcomm.
>>>
>>> Next, you wrote that in-driver polling eliminates CPU wakeup and
>>> scheduling latencies. However, this should also increase power
>>> consumption. Is there any measurable difference in the latencies,
>>> given that you already use an ioctl() syscall and as such there will
>>> be two context switches? What is the actual impact?
>> Hi Dmitry,
>>
>> Thank you for your feedback.
>>
>> I'm currently reworking this change and adding testing details. Regarding the SoC
>> support, I'll add all the necessary information.
> Please make sure that both the kernel and the userspace can handle the
> 'non-supported' case properly.

Yes, I will include changes to handle the non-supported case in both
userspace and the kernel.
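
On the kernel side I expect it to take roughly the following shape (a
sketch only; the per-channel polling_supported flag is a placeholder for
whatever capability check is used), so that userspace gets a clean error
and can fall back to the normal glink wait on unsupported platforms:

static int fastrpc_req_poll_mode(struct fastrpc_user *fl, bool enable)
{
	/* polling_supported is a placeholder per-SoC/per-domain capability flag */
	if (enable && !fl->cctx->polling_supported)
		return -EOPNOTSUPP;

	fl->poll_mode = enable;
	return 0;
}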

>
>> For now, with in-driver polling, we are seeing significant performance
>> improvements for calls with different-sized buffers. On a platform that
>> supports polling, I've observed an ~80us improvement in latency. You can
>> find more details in the test results here:
>> https://github.com/quic/fastrpc/pull/134/files#diff-7dbc6537cd3ade7fea5766229cf585db585704e02730efd72e7afc9b148e28ed
> Does the improvement come from the CPU not going to idle or from the
> glink response processing?

Although both contribute to the performance improvement, the major
gain comes from the CPU not going into the idle state.

Thanks,
Ekansh

>
>> Regarding your concerns about power consumption: while in-driver polling
>> eliminates CPU wakeup and scheduling latencies, it does increase power
>> consumption. However, the performance gains seem to outweigh this increase.
>>
>> Do you think the poll implementation that you suggested above could provide similar
>> improvements?
> No, I agree here. I was more focused on userspace polling rather
> than hw polling.
>




