Re: [RFC] dt-bindings: mailbox: add doorbell support to ARM MHU

Viresh Kumar <viresh.kumar@xxxxxxxxxx> · Wed, 10 Jun 2020 15:03:34 +0530

On 05-06-20, 10:42, Jassi Brar wrote:
> Since origin upto scmi_xfer, there can be many forms of sleep like
> schedule/mutexlock etc.... think of some userspace triggering sensor
> or dvfs operation. Linux does not provide real-time guarantees. Even
> if remote (scmi) firmware guarantee RT response, it makes sense to
> timeout a response only after the _request is on the bus_  and not
> when you submit a request to the api (unless you serialise it).
> IOW, start the timeout from  mbox_client.tx_prepare()  when the
> message actually gets on the bus.

There are multiple purposes of the timeout IMO:

- Returning early if the other side is dead/hung. In such a case the
  timeout can start when the request is put on the bus, as we don't
  care how long it takes to complete the request, as long as the
  request can be fulfilled. An i2c/spi memory read is an example of
  this.

- Ensuring a maximum time within which the request must be serviced.
  There may be hard requirements, as in the case of DVFS from the
  scheduler's hot path (which is essential for the overall system to
  work well). For such a case the timeout is at the right place IMO,
  i.e. right after a request is submitted to the mailbox.
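
To make the difference between the two concrete, here is a minimal
userspace sketch (not kernel code). The struct, its fields and both
helpers are hypothetical names invented for illustration; only the two
starting points of the clock come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical timestamps (in microseconds) for one mailbox request. */
struct req_times {
	uint64_t submitted_us;	/* client handed the request to the mailbox API */
	uint64_t on_bus_us;	/* request left the queue and went on the bus */
	uint64_t completed_us;	/* response arrived */
};

/* Liveness check: the clock starts once the request is on the bus. */
static bool timed_out_from_bus(const struct req_times *t, uint64_t timeout_us)
{
	assert(t->on_bus_us <= t->completed_us);
	return t->completed_us - t->on_bus_us > timeout_us;
}

/* Deadline check: the clock starts at submission, so queueing delay counts. */
static bool timed_out_from_submit(const struct req_times *t, uint64_t timeout_us)
{
	assert(t->submitted_us <= t->completed_us);
	return t->completed_us - t->submitted_us > timeout_us;
}
```

With made-up numbers, a request that waits 900 us in the queue and is
then answered in 50 us passes the first check with a 500 us timeout but
fails the second one, which is the behaviour the DVFS case needs.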

And some more points I wanted to share:

- I am not sure I understood the *serializing* part you guys were
  talking about. I believe the mailbox framework is already
  serializing the requests it receives on a single channel with a
  spin lock, right? Why does the client need to serialize them as
  well? Is that for avoiding timeouts?

- For me, and Sudeep as well IIUC, the bigger problem isn't that
  timeouts are happening and requests are failing (so changing the
  timeout to a bigger value isn't going to fix anything). The
  problem is that it takes too long (because of the queue of
  requests on a channel) for a request to finish after being
  submitted. The scheduler doesn't care about the underlying
  logistics, for example; all it cares about is the time it takes to
  change the frequency of a CPU. If you can do it fast enough in a
  guaranteed manner, then you can use fast switching, otherwise not.

- The hardware can very well support the case today where this can be
  done in parallel and (almost) in a guaranteed time-frame, while the
  software wants to add a limit to that and so wants to serialize
  requests.

- As many people have already suggested (like me, Sudeep, Rob, maybe
  Bjorn as well), it seems silly to not allow driving the h/w in the
  most efficient way possible (and allow fast cpu switching in this
  case).
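
A rough back-of-the-envelope model of the queueing problem, again as a
userspace sketch and not kernel code. All names and numbers here are
assumptions for illustration: requests are taken to be submitted at the
same instant, each with the same fixed service time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Latency seen by the index-th (0-based) request when all requests
 * share one serialized channel: it waits for everything ahead of it.
 */
static uint64_t serialized_latency_us(unsigned int index, uint64_t service_us)
{
	return ((uint64_t)index + 1) * service_us;
}

/* Count requests whose submit-to-completion latency exceeds the deadline. */
static unsigned int count_missed(unsigned int nr_requests, uint64_t service_us,
				 uint64_t deadline_us, bool parallel)
{
	unsigned int i, missed = 0;

	assert(service_us > 0);

	for (i = 0; i < nr_requests; i++) {
		/* With independent doorbells, nobody waits behind anyone else. */
		uint64_t latency = parallel ? service_us :
				   serialized_latency_us(i, service_us);

		if (latency > deadline_us)
			missed++;
	}

	return missed;
}
```

For example, with 8 requests of 50 us each and a 200 us deadline, the
serialized channel misses the deadline for the last 4 requests, while
the parallel case misses none. A bigger timeout hides the failures but
does not make the last request finish any sooner.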

> Interesting logs !  The time taken to complete _successful_ requests
> are arguably better in bad_trace ... there are many <10usec responses
> in bad_trace, while the fastest response in good_trace is  53usec.

Indeed this is interesting. It may be worth looking (separately) into
why we don't see those 3 us long requests anymore, or maybe they were
just not there in the logs.

> And the requests that 'fail/timeout' are purely the result of not
> serialising them or checkout for timeout at wrong place as explained
> above.

We can't allow requests to go on forever in some cases, while in
other cases it may be absolutely fine.

-- 
viresh