[PATCH v2 00/11] Drivers for gunyah hypervisor
Elliot Berman
quic_eberman at quicinc.com
Tue Aug 9 17:07:39 PDT 2022
On 8/9/2022 6:13 AM, Robin Murphy wrote:
> [drive-by observation since one thing caught my interest...]
>
Appreciate all the comments.
Jassi,
I understood you have talked with some of our folks (Trilok and Carl) a
few years ago about using the mailbox APIs. We were steered away from
using mailboxes then. Is that still the recommendation today?
> On 2022-08-09 00:38, Elliot Berman wrote:
>>> I might be completely wrong about this, but if my in-mind picture of
>>> Gunyah is correct, I'd have implemented the gunyah core subsystem as
>>> mailbox provider, RM as a separate platform driver consuming these
>>> mailboxes and in turn being a remoteproc driver, and consoles as
>>> remoteproc subdevices.
>
>>
>> The mailbox framework can only fit with message queues and not
>> doorbells or vCPUs.
>
> Is that so? There was a whole long drawn-out saga around the SCMI
> protocol using the Arm MHU mailbox as a set of doorbells for
> shared-memory payloads, but it did eventually get merged as the separate
> arm_mhu_db.c driver, so unless we're talking about some completely
> different notion of "doorbell"... :/
>
Doorbells are harder to fit into the mailbox framework:
- Simple doorbells don't have any TX done acknowledgement model at
the doorbell layer (see bullet 1 from
https://lore.kernel.org/all/68e241fd-16f0-96b4-eab8-369628292e03@quicinc.com/).
Doorbell clients might have a doorbell acknowledgement flow, but the
only client I have for doorbells doesn't. IRQFDs would send an
empty message to the mailbox and immediately do a client-triggered
TX_DONE.
- Using mailboxes for the more advanced doorbell use case forces
  clients into a particular model: either each channel represents one
  bit of the bitmask, or a single client controls the entire bitmask.
  Implementing the mailbox API would force the otherwise-generic
  doorbell code to make that decision on behalf of its clients.
Further, I wanted to highlight one other challenge with fitting Gunyah
message queues into the mailbox API:
- Message queues return a flag from msgq_send which indicates whether
  there is still space available in the queue. When the message queue
  is full, an interrupt is raised once space becomes available again.
  That interrupt could serve as a TX_DONE indicator, but the mailbox
  framework's API prevents us from calling mbox_chan_txdone inside the
  send_data channel op.
I think this might be solvable by adding a new txdone mechanism.
>> The mailbox framework also relies on the mailbox being defined in the
>> devicetree. RM is an exceptional case in that it is described in the
>> devicetree. Message queues for other VMs would be dynamically created
>> at runtime as/when that VM is created. Thus, the client of the message
>> queue would need to "own" both the controller and client ends of the
>> mailbox.
>
> FWIW, if the mailbox API does fit conceptually then it looks like it
> shouldn't be *too* hard to better abstract the DT details in the
> framework itself and allow providers to offer additional means to
> validate channel requests, which might be more productive than inventing
> a whole new thing.
>
Some notes about fitting mailboxes into Gunyah IPC:
- A single mailbox controller can't cover all the gunyah devices. The
  number of gunyah devices is not fixed and varies with each VM
  launched. A mailbox controller would need to be per-VM or
  per-device, where each channel represents a capability.
- The other device types (like vCPU) don't fit a message-based
  framework. I'd like a consistent way of binding a device's function
  with the device. If we use the mailbox API, some devices will use
  mailboxes and others will use some other mechanism. I'd prefer to
  consistently use "some other mechanism" throughout.
- TX and RX message queues are independent and "combining" a TX and RX
message queue happens at client layer by the client requesting access
to two otherwise unassociated message queues. A mailbox channel would
either be associated with a TX message queue capability or an RX
message queue capability. This isn't a major hurdle per se, but it
decreases how cleanly we can use the mailbox APIs IMO.
- A VM might only have a TX message queue and no RX message queue,
or vice versa. We won't be able to require coupling a TX and RX
message queue for the mailbox.
- TX done acknowledgement doesn't fit Gunyah IPC (see above) and a new
TX_DONE mode would need to be implemented.
- Need to make it possible for a client to bind a mailbox channel
  without DT.
I'm getting a bit apprehensive about the tweaks needed to make mailbox
framework usable for Gunyah. Will there be enough code re-use and help
with abstracting the direct-to-Gunyah APIs? IMO, there isn't, but
opinions are welcome :)
Thanks,
Elliot