[PATCH v2 01/15] mailbox: Deprecate NULL mbox messages; Introduce mbox_ring_doorbell()
Bjorn Andersson
andersson at kernel.org
Mon Feb 9 07:40:15 PST 2026
On Sat, Feb 07, 2026 at 08:01:23PM -0800, Douglas Anderson wrote:
> The way the mailbox core behaves when you pass a NULL `mssg` parameter
> to mbox_send_message() is a little questionable. Specifically, the
> mailbox core stores the currently active message directly in its
> `active_req` field. In at least two places it decides that if this
> field is `NULL` then there is no active request. That means if `mssg`
> is ever NULL it will cause the mailbox core to think is no active
> request. The two places where it does this are:
>
> 1. When a client calls mbox_send_message(), if `active_req` is NULL
> then it will call the mailbox controller to send the new message
> even if the mailbox controller hasn't yet called mbox_chan_txdone()
> on the previous (NULL) message.
> 2. The mailbox core will never call the client's `tx_done()` callback
> with a NULL message because `tx_tick()` returns early whenever the
> message is NULL.
>
> Though the above doesn't look like it was a conscious design choice,
> it does have the benefit of providing a simple way to assert an
> edge-triggered interrupt to the remote processor on the other side of
> the mailbox. Specifically:
>
> 1. Like a normal edge-triggered interrupt, if multiple edges arrive
> before the interrupt is Acked they are coalesced.
> 2. Like a normal edge-triggered interrupt, as long as the receiver
> (the remote processor in this case) "Ack"s the interrupt _before_
> checking for work and the sender (the mailbox client in this case)
> posts the interrupt _after_ adding new work then we can always be
> certain that new work will be noticed. This assumes that the
> mailbox client and remote processor have some out-of-band way to
> communicate work and the mailbox is just being used as an
> interrupt.
>
> Doing a `git grep -A1 mbox_send_message | grep NULL` shows 14 hits in
> mainline today, but it's not 100% clear if all of those users are
> relying on the benefits/quirks of the existing behavior.
>
> Since the current NULL `mssg` behavior is a bit questionable but has
> some benefits, let's:
>
> 1. Deprecate the NULL behavior and print a warning.
> 2. Add a new mbox_ring_doorbell() function that is very similar to the
> existing NULL `mssg` case but a tad bit cleaner.
>
> The design of the new mbox_ring_doorbell() will be to maximize
> compatibility with the old NULL `mssg` behavior. Specifically:
>
> * We'll still pass NULL to the mailbox controller to indicate a
> doorbell.
> * Doorbells will not be queued and won't have txdone.
> * We'll call immediately into the mailbox controller when a doorbell
> is posted.
>
> With the above, any mailbox clients that don't mix doorbells and
> normal messages are intended to see no change in behavior when
> switching to the new API. Using the new API, which officiall documents
> that mbox_client_txdone() shouldn't be called for doorbells, does
> allow us to remove those calls.
>
> There are two differences in behavior between the old sending a NULL
> message and the new mbox_ring_doorbell():
>
> 1. If the mailbox controller returned an error when trying to send a
> NULL message, the old NULL message could have ended up being queued
> up in the core's FIFO. Now we will just return the error.
> 2. If a client rings a doorbell while a non-doorbell message is in
> progress, previously NULL messages would have been "queued" in that
> case and now doorbells will be immediately posted.
>
> I'm hoping that nobody was relying on either of the two differences.
> In general holding NULL messages in the mailbox core's queue has odd
> behavior and is hard to reason about. Hopefully it's reasonable to
> assume nobody was doing this.
>
> As mentioned above, it should be noted that it's now documented that
> "txdone" shouldn't be called (by both mailbox drivers and clients) for
> doorbells. That being said, in most cases it won't hurt since the
> mailbox core will ignore the bogus "txdone". The only case where it's
> critical for a mailbox controller not to call "txdone" for a doorbell
> is when a mailbox channel mixes normal messages and doorbells and
> cares about the txdone callback. Specifically, when you ring a
> doorbell and immediately send a normal message, if the controller
> calls "txdone" for the doorbell it could look as if the normal message
> finished before it should have. This issue also would have happened
> with the old NULL `mssg`, though.
>
> Signed-off-by: Douglas Anderson <dianders at chromium.org>
I like how this cleans up the logic hacks, but perhaps even more so how
it takes a step towards cleaning up the mailbox API when it comes to
expectations between client and provider implementations.
Reviewed-by: Bjorn Andersson <andersson at kernel.org>
Regards,
Bjorn
More information about the linux-arm-kernel
mailing list