On 8/9/2022 6:13 AM, Robin Murphy wrote:
[drive-by observation since one thing caught my interest...]
Appreciate all the comments.
Jassi,
I understand you talked with some of our folks (Trilok and Carl) a
few years ago about using the mailbox APIs. We were steered away from
using mailboxes then. Is that still the recommendation today?
On 2022-08-09 00:38, Elliot Berman wrote:
I might be completely wrong about this, but if my in-mind picture of
Gunyah is correct, I'd have implemented the gunyah core subsystem as a
mailbox provider, RM as a separate platform driver consuming these
mailboxes and in turn being a remoteproc driver, and consoles as
remoteproc subdevices.
The mailbox framework can only fit with message queues and not
doorbells or vCPUs.
Is that so? There was a whole long drawn-out saga around the SCMI
protocol using the Arm MHU mailbox as a set of doorbells for
shared-memory payloads, but it did eventually get merged as the separate
arm_mhu_db.c driver, so unless we're talking about some completely
different notion of "doorbell"... :/
Doorbells will be harder to fit into the mailbox API framework.
- Simple doorbells don't have any TX done acknowledgement model at
the doorbell layer (see bullet 1 from
https://lore.kernel.org/all/68e241fd-16f0-96b4-eab8-369628292e03@xxxxxxxxxxx/).
Doorbell clients might have a doorbell acknowledgement flow, but the
only client I have for doorbells doesn't. IRQFDs would send an
empty message to the mailbox and immediately do a client-triggered
TX_DONE.
- Using mailboxes for the more advanced doorbell use case forces clients
to use doorbells a certain way: each channel could be one bit of
the bitmask, or the client could have complete control of the entire
bitmask. I think implementing the mailbox API would force the
otherwise-generic doorbell code to make that decision for clients.
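
To make the second bullet concrete, here is a rough userspace sketch of
the two ownership models. Everything in it is hypothetical and only for
illustration: the struct and function names are made up, and the 64-bit
mask width is an assumption, not something taken from the Gunyah API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a doorbell: a mask of pending bits (width assumed). */
struct doorbell {
	uint64_t pending;
};

/* Model A: the client owns the whole mask and may ring any set of bits. */
static void doorbell_ring_mask(struct doorbell *db, uint64_t mask)
{
	db->pending |= mask;
}

/* Model B: one "channel" per bit; a channel can only ring its own bit. */
struct doorbell_chan {
	struct doorbell *db;
	unsigned int bit;
};

static void doorbell_chan_ring(struct doorbell_chan *ch)
{
	assert(ch->bit < 64);
	ch->db->pending |= (uint64_t)1 << ch->bit;
}

/* Receiver side: read and clear everything that is pending. */
static uint64_t doorbell_take(struct doorbell *db)
{
	uint64_t mask = db->pending;

	db->pending = 0;
	return mask;
}
```

A generic doorbell layer can offer both entry points and let the client
pick. A mailbox controller has to decide up front what a channel means,
which is the choice I would rather not make on the client's behalf.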
Further, I wanted to highlight one other challenge with fitting Gunyah
message queues into the mailbox API:
- Message queues track a flag which indicates whether there is space
available in the queue. The flag is returned on msgq_send. When the
message queue is full, an interrupt is raised when there is more
space available. This could be used as a TX_DONE indicator, but the
mailbox framework's API prevents us from calling mbox_chan_txdone
inside the send_data channel op.
I think this might be solvable by adding a new txdone mechanism.
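
A rough userspace model of the flow described above, to show where the
completion has to be signalled. This is only a sketch under assumptions:
the queue depth, the enum, and the function names (msgq_model_send,
msgq_model_recv) are hypothetical and are neither the Gunyah hypercall
interface nor the mailbox API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MSGQ_MODEL_DEPTH 2	/* arbitrary depth for the sketch */

/* Hypothetical stand-in for a TX message queue. */
struct msgq_model {
	size_t used;
};

/* What the send path learns about completion. */
enum tx_status {
	TX_ERR = -1,		/* queue was already full, nothing sent */
	TX_DONE_NOW,		/* queue still has space after this send */
	TX_DONE_ON_IRQ,		/* queue is now full, wait for the interrupt */
};

/*
 * Models the send call: the "space available" flag comes back with the
 * send itself, so completion is known before the send op has returned.
 */
static enum tx_status msgq_model_send(struct msgq_model *q)
{
	if (q->used == MSGQ_MODEL_DEPTH)
		return TX_ERR;

	q->used++;
	return q->used < MSGQ_MODEL_DEPTH ? TX_DONE_NOW : TX_DONE_ON_IRQ;
}

/*
 * Models the receiver draining one message. Returns true when the queue
 * goes from full to not full, which is when the "space available"
 * interrupt would be raised to the sender.
 */
static bool msgq_model_recv(struct msgq_model *q)
{
	bool was_full = q->used == MSGQ_MODEL_DEPTH;

	if (q->used == 0)
		return false;

	q->used--;
	return was_full;
}
```

In the TX_DONE_NOW case the natural place to report completion is
inside the send path, which is exactly what the framework does not
allow today. The TX_DONE_ON_IRQ case maps onto an interrupt-driven
txdone without trouble.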
The mailbox framework also relies on the mailbox being defined in the
devicetree. RM is an exceptional case in that it is described in the
devicetree. Message queues for other VMs would be dynamically created
at runtime as/when that VM is created. Thus, the client of the message
queue would need to "own" both the controller and client ends of the
mailbox.
FWIW, if the mailbox API does fit conceptually then it looks like it
shouldn't be *too* hard to better abstract the DT details in the
framework itself and allow providers to offer additional means to
validate channel requests, which might be more productive than inventing
a whole new thing.
Some notes about fitting mailboxes into Gunyah IPC:
- A single mailbox controller can't cover all the gunyah devices. The
number of gunyah devices is not fixed and varies per VM launched.
Mailbox controller would need to be per-VM or per-device, where each
channel represents a capability.
- The other device types (like vCPU) don't fit into a message-based
framework. I'd like to have a consistent way of binding a
device's function with the device. If we use mailbox API, some
devices will use mailbox and others will use some other mechanism.
I'd prefer to consistently use "some other mechanism" throughout.
- TX and RX message queues are independent, and "combining" a TX and RX
message queue happens at the client layer by the client requesting access
to two otherwise unassociated message queues. A mailbox channel would
either be associated with a TX message queue capability or an RX
message queue capability. This isn't a major hurdle per se, but it
decreases how cleanly we can use the mailbox APIs IMO.
- A VM might only have a TX message queue and no RX message queue,
or vice versa. We won't be able to require coupling a TX and RX
message queue for the mailbox.
- TX done acknowledgement doesn't fit Gunyah IPC (see above) and a new
TX_DONE mode would need to be implemented.
- Need to make it possible for a client to bind a mailbox channel
without DT.
I'm getting a bit apprehensive about the tweaks needed to make the mailbox
framework usable for Gunyah. Will there be enough code re-use and help
with abstracting the direct-to-Gunyah APIs? IMO, there won't be, but
opinions are welcome :)
Thanks,
Elliot