Picking up the old thread again after getting pinged by multiple
colleagues about it (thanks!) and reading through the history.

On Fri, Jun 12, 2020 at 7:29 AM Viresh Kumar <viresh.kumar@xxxxxxxxxx> wrote:
>
> On 11-06-20, 19:34, Jassi Brar wrote:
> > In the first post in this thread, Viresh lamented that mailbox
> > introduces "a few ms" delay in the scheduler path.
> > Your own tests show that is certainly not the case -- average is the
> > same as proposed virtual channels 50-100us, the best case is 3us vs
> > 53us for virtual channels.
>
> Hmmm, I am not sure where is the confusion here Jassi. There are two
> things which are very very different from each other.
>
> - Time taken by the mailbox framework (and remote for acknowledging
>   it) for completion of a single request, this can be 3us to 100s of
>   us. This is clear for everyone. THIS IS NOT THE PROBLEM.
>
> - Delay introduced by few of such requests on the last one, i.e. 5
>   normal requests followed by an important one (like DVFS), the last
>   one needs to wait for the first 5 to finish first. THIS IS THE
>   PROBLEM.

Earlier, Jassi also commented "Linux does not provide real-time
guarantees", which to me is what actually causes the issue here: Linux
having timeouts when communicating with the firmware means that it
relies on the hardware and firmware having real-time behavior, even
though it does not provide real-time guarantees to its own processes.

When comparing the two usage models, it's clear that the minimum
latency for a message delivery is always at least the time to process
an interrupt, plus at least one expensive MMIO read and one cheaper
posted MMIO write for an ack. If we have a doorbell plus an
out-of-band message, we need an extra DMA barrier and a read from
coherent memory, both of which can be noticeable (see the sketch in
the P.S. below).

As soon as messages are queued in the current model, the maximum
latency grows by a potentially unbounded number of round-trips, while
that problem does not exist in the doorbell model. So I agree that we
need to handle both modes in the kernel, to deal with all existing
hardware as well as with firmware that requires low-latency
communication.

It also sounds like that debate is already settled, because there are
platforms using both modes, and in the kernel we usually end up
supporting the platforms that our users have, whether we think they
are a good idea or not.

The only questions that I see in need of an answer are:

1. Should the binding use just different "#mbox-cells" values, or
   also different "compatible" strings, to express that difference?

2. Should one driver try to handle both modes, or should there be
   two drivers?

It sounds like Jassi strongly prefers separate drivers, which would
make separate compatible strings the more practical approach (a
sketch of that is in the second postscript below). While one can
argue that a single piece of hardware should only have one DT
description, the counter-argument is that the behavior described by
the DT here is determined by the combination of the hardware and the
firmware behind it, and those combinations are in fact different.

Arnd
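
P.S. To make the latency comparison above concrete, here is a rough
and untested sketch of the two receive paths. All register offsets,
the shared-memory layout and the function names are invented purely
for illustration:

#include <linux/io.h>	/* readl/writel; dma_rmb() comes in via asm headers */

/* data-register mode: the payload travels through device registers */
static u32 data_mode_receive(void __iomem *regs)
{
	u32 msg;

	msg = readl(regs + 0x10);	/* expensive non-posted MMIO read */
	writel(1, regs + 0x14);		/* cheaper posted MMIO write as ack */

	return msg;
}

/*
 * doorbell mode: the interrupt only says "go look", the payload sits
 * in coherent memory that the remote side wrote before ringing us
 */
static u32 doorbell_mode_receive(void __iomem *regs, const u32 *shmem)
{
	u32 msg;

	readl(regs + 0x00);	/* MMIO read to see which bell rang */
	dma_rmb();		/* extra barrier before touching the payload */
	msg = shmem[0];		/* extra read from coherent memory */
	writel(1, regs + 0x04);	/* posted ack */

	return msg;
}

The doorbell path pays for the barrier and the coherent-memory read
on every message, but never waits behind queued requests.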
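
P.P.S. On question 1, a minimal sketch of how the two modes could be
keyed off separate compatible strings; the "vendor,xyz-mbox" names
and the xyz_* identifiers are made up here. The same match table
would work whether the two entries end up in one driver or in two:

#include <linux/mod_devicetable.h>
#include <linux/of_device.h>
#include <linux/platform_device.h>

enum xyz_mbox_mode { XYZ_MBOX_DATA, XYZ_MBOX_DOORBELL };

static const struct of_device_id xyz_mbox_of_match[] = {
	/* payload passed through data registers */
	{ .compatible = "vendor,xyz-mbox",
	  .data = (void *)XYZ_MBOX_DATA },
	/* doorbell only, payload in shared memory */
	{ .compatible = "vendor,xyz-mbox-doorbell",
	  .data = (void *)XYZ_MBOX_DOORBELL },
	{ /* sentinel */ }
};

static int xyz_mbox_probe(struct platform_device *pdev)
{
	enum xyz_mbox_mode mode;

	mode = (enum xyz_mbox_mode)(unsigned long)
		of_device_get_match_data(&pdev->dev);

	/* set up either the data-register or the doorbell path */

	return 0;
}

With distinct compatibles the DT fully describes the hardware plus
firmware combination, and the #mbox-cells value can stay whatever
each mode naturally needs.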