[PATCH v2] firmware: arm_scmi: Queue in scmi layer for mailbox implementation
Justin Chen
justin.chen at broadcom.com
Fri Oct 11 12:15:07 PDT 2024
On 10/11/24 6:43 AM, Cristian Marussi wrote:
> On Wed, Oct 09, 2024 at 12:26:37PM -0700, Justin Chen wrote:
>> send_message() does not block in the MBOX implementation. This is
>> because the mailbox layer has its own queue. However, this confuses
>> the per xfer timeouts as they all start their timeout ticks in
>> parallel.
>>
>> Consider a case where the xfer timeout is 30ms and a SCMI transaction
>> takes 25ms.
>>
>> 0ms: Message #0 is queued in mailbox layer and sent out, then sits
>> at scmi_wait_for_message_response() with a timeout of 30ms
>> 1ms: Message #1 is queued in mailbox layer but not sent out yet.
>> Since send_message() doesn't block, it also sits at
>> scmi_wait_for_message_response() with a timeout of 30ms
>> ...
>> 25ms: Message #0 is completed, txdone is called and Message #1 is
>> sent out
>> 31ms: Message #1 times out since the count started at 1ms. Even
>> though it has only been inflight for 6ms.
>>
>> Fixes: b53515fa177c ("firmware: arm_scmi: Make MBOX transport a standalone driver")
>> Signed-off-by: Justin Chen <justin.chen at broadcom.com>
>> ---
>>
>> Changes in v2:
>
> Hi Justin,
>
> thanks.
>
> A few nitpicks and one remark down below.
>
>>
>> - Added Fixes tag
>> - Improved commit message to better capture the issue
>>
>> .../firmware/arm_scmi/transports/mailbox.c | 21 +++++++++++++------
>> 1 file changed, 15 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/firmware/arm_scmi/transports/mailbox.c b/drivers/firmware/arm_scmi/transports/mailbox.c
>> index 1a754dee24f7..30bc2865582f 100644
>> --- a/drivers/firmware/arm_scmi/transports/mailbox.c
>> +++ b/drivers/firmware/arm_scmi/transports/mailbox.c
>> @@ -33,6 +33,7 @@ struct scmi_mailbox {
>> struct mbox_chan *chan_platform_receiver;
>> struct scmi_chan_info *cinfo;
>> struct scmi_shared_mem __iomem *shmem;
>> + struct mutex chan_lock;
>
> Missing Doxygen comment....
>
> arm_scmi/transports/mailbox.c:39: warning: Function parameter or struct member 'chan_lock' not described in 'scmi_mailbox
>
>> };
>>
>> #define client_to_scmi_mailbox(c) container_of(c, struct scmi_mailbox, cl)
>> @@ -205,6 +206,7 @@ static int mailbox_chan_setup(struct scmi_chan_info *cinfo, struct device *dev,
>> cl->rx_callback = rx_callback;
>> cl->tx_block = false;
>> cl->knows_txdone = tx;
>> + mutex_init(&smbox->chan_lock);
>
> This could be move at the end of this function after the channels are
> requested and it is no more possible to fail and bail out....messages
> wont flow and lock wont be used anyway until this chan_setup completes...
> ...BUT I have NOT string opinion about this....you can leave it here
> too...up to you
>>
>> smbox->chan = mbox_request_channel(cl, tx ? 0 : p2a_chan);
>> if (IS_ERR(smbox->chan)) {
>> @@ -267,11 +269,21 @@ static int mailbox_send_message(struct scmi_chan_info *cinfo,
>> struct scmi_mailbox *smbox = cinfo->transport_info;
>> int ret;
>>
>> + /*
>> + * The mailbox layer has it's own queue. However the mailbox queue confuses
> its own queue
>
>> + * the per message SCMI timeouts since the clock starts when the message is
>> + * submitted into the mailbox queue. So when multiple messages are queued up
>> + * the clock starts on all messages instead of only the one inflight.
>> + */
>> + mutex_lock(&smbox->chan_lock);
>> +
>> ret = mbox_send_message(smbox->chan, xfer);
>>
>> /* mbox_send_message returns non-negative value on success, so reset */
>> if (ret > 0)
>> ret = 0;
>> + else
>> + mutex_unlock(&smbox->chan_lock);
>
> I think this should be
>
> else if (ret < 0)
> mutex_unlock(&smbox->chan_lock);
>
> ...since looking at mbox_send_message() and its implementation it returns
> NON-Negative integers on Success...so 0 from mbox_send_mmessage() also means
> SUCCESS and we should not release the mutex (I think the 'ret' returned
> here is the idx from add_to_rbuf...so it will become zero peridiocally
> on normal successfull operation)
>
Yes, I see the implementation. Looks like it returns the position in the
ring buffer. I also confirmed with CONFIG_DEBUG_MUTEXES which triggers a
warning.
What about this?
if (ret >= 0)
ret = 0
else
mutex_unlock(&smbox->chan_lock);
A bit easier to read IMO.
Thanks,
Justin
>>
>> return ret;
>> }
>> @@ -281,13 +293,10 @@ static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret,
>> {
>> struct scmi_mailbox *smbox = cinfo->transport_info;
>>
>> - /*
>> - * NOTE: we might prefer not to need the mailbox ticker to manage the
>> - * transfer queueing since the protocol layer queues things by itself.
>> - * Unfortunately, we have to kick the mailbox framework after we have
>> - * received our message.
>> - */
>> mbox_client_txdone(smbox->chan, ret);
>> +
>> + /* Release channel */
>> + mutex_unlock(&smbox->chan_lock);
>> }
>>
>> static void mailbox_fetch_response(struct scmi_chan_info *cinfo,
>> --
>> 2.34.1
>>
>
> I gave it a go on a couple of JUNO, without any issues.
>
> Other than the above, LGTM.
>
> Reviewed-by: Cristian Marussi <cristian.marussi at arm.com>
> Tested-by: Cristian Marussi <cristian.marussi at arm.com>
>
> Thanks,
> Cristian
>
More information about the linux-arm-kernel
mailing list