[PATCH] firmware: arm_scmi: Queue in scmi layer for mailbox implementation

Justin Chen justin.chen at broadcom.com
Mon Oct 7 22:02:32 PDT 2024



On 10/7/2024 7:43 PM, Peng Fan wrote:
>> Subject: Re: [PATCH] firmware: arm_scmi: Queue in scmi layer for
>> mailbox implementation
>>
>>
>>
>> On 10/7/24 6:10 AM, Cristian Marussi wrote:
>>> On Mon, Oct 07, 2024 at 02:04:10PM +0100, Sudeep Holla wrote:
>>>> On Fri, Oct 04, 2024 at 03:12:57PM -0700, Justin Chen wrote:
>>>>> The mailbox layer has its own queue. However this confuses the
>>>>> per-message timeouts since the clock starts ticking the moment the
>>>>> messages get queued up. So all messages in the queue have their
>>>>> timeout clocks ticking instead of only the message in flight. To fix
>>>>> this, let's move the queue back into the SCMI layer.
>>>>>
>>>>
>>>> I think this has come up in the past. We have avoided adding
>>>> additional locking here as the mailbox layer takes care of it. Has
>>>> anything changed recently?
>>>
>>> I asked for an explanation in my reply (our mails probably crossed)
>>> since this has already come up a few times in the past and central
>>> locking seemed not to be needed. Here the difference is the reason:
>>> Justin talks about message timeouts related to the queueing process,
>>> so I asked him to explain the details (and the anomaly observed),
>>> since it still does not seem to me that the lock is needed even in
>>> this case. Anyway, I can definitely be wrong of course :D
>>>
>>
>> Hello Cristian,
>>
>> Thanks for the response. I'll try to elaborate.
>>
>> When comparing SMC and mailbox transport, we noticed that the mailbox
>> transport times out much more quickly under load. Originally we
>> thought this was the latency of the mailbox implementation, but after
>> debugging we noticed some odd behavior: SCMI transactions were
>> timing out before the mailbox had even transmitted the message.
>>
>> The issue lies in the SCMI layer, in the do_xfer() function in
>> drivers/firmware/arm_scmi/driver.c.
>>
>> The fundamental issue is send_message() blocks for SMC transport, but
>> doesn't block for mailbox transport. So if send_message() doesn't block
>> we can have multiple messages waiting at
>> scmi_wait_for_message_response().
>>
>> SMC looks like this:
>> CPU #0 SCMI message 0 -> calls send_message() then calls
>> scmi_wait_for_message_response(), times out after 30ms.
>> CPU #1 SCMI message 1 -> blocks at send_message() waiting for SCMI
>> message 0 to complete.
>>
>> Mailbox looks like this:
>> CPU #0 SCMI message 0 -> calls send_message(), mailbox layer queues
>> up message, mailbox layer sees no message is outgoing and sends it.
>> CPU waits at scmi_wait_for_message_response(), times out after 30ms.
>> CPU #1 SCMI message 1 -> calls send_message(), mailbox layer queues
>> up message, mailbox layer sees a message pending, holds the message
>> in the queue. CPU waits at scmi_wait_for_message_response(), times
>> out after 30ms.
>>
>> Let's say the transport takes 25ms. The first message would succeed,
>> but the second message would time out only 5ms after it is sent.
> 
> Each xfer has its own completion, so how does the 2nd impact the 1st?
> 
> Regards,
> Peng.
> 

Hello Peng,

The mailbox layer queues messages and doesn't block when send_message()
is called. So we can have both xfers waiting for completion at the same
time.

Let's assume a message takes 25ms to complete, with a 30ms timeout.

0ms Message #0 is queued in mailbox layer and sent out, then sits at 
scmi_wait_for_message_response() with a timeout of 30ms
1ms Message #1 is queued in mailbox layer but not sent out yet. Since 
send_message() doesn't block, it also sits at 
scmi_wait_for_message_response() with a timeout of 30ms
...
25ms Message #0 is completed, txdone is called and Message #1 is sent out
31ms Message #1 times out since the count started at 1ms, even though it
has only been in flight for 6ms.
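
The timeline above can be reproduced with a rough model. This is only a
sketch in Python, not kernel code: simulate() and its parameters are names
made up for illustration, and it assumes a single channel that is released
only when the reply arrives (txdone), whether or not the waiter has already
timed out. The timer_starts_at_send flag switches between the current mailbox
behavior (timeout clock starts when the message is queued) and the behavior
where the clock only starts once the message is actually sent.

```python
def simulate(queue_times_ms, transport_ms, timeout_ms, timer_starts_at_send):
    """Model one channel that carries a single message at a time.

    All names here are hypothetical; this is not the SCMI or mailbox API.
    Returns one dict per message with its queue/send/done times, the
    deadline of its timeout, and whether it timed out.
    """
    results = []
    channel_free_at = 0  # time at which the previous reply arrived
    for queued in queue_times_ms:
        # A message is only sent once the channel is free.
        sent = max(queued, channel_free_at)
        done = sent + transport_ms
        # Current mailbox path: the clock starts at queue time.
        # Alternative: the clock starts when the message is sent.
        timer_start = sent if timer_starts_at_send else queued
        deadline = timer_start + timeout_ms
        results.append({
            "queued": queued,
            "sent": sent,
            "done": done,
            "deadline": deadline,
            "timed_out": done > deadline,
        })
        # Assumption: the channel stays busy until the reply arrives.
        channel_free_at = done
    return results


if __name__ == "__main__":
    for at_send in (False, True):
        print("timer starts at", "send" if at_send else "queue")
        for i, msg in enumerate(simulate([0, 1], 25, 30, at_send)):
            print(f"  message #{i}: {msg}")
```

With messages queued at 0ms and 1ms, a 25ms transport and a 30ms timeout,
the queue-time variant gives Message #1 a deadline of 31ms while it is only
sent at 25ms. The send-time variant gives it the full 30ms in flight.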

Thanks,
Justin

>>
>> Hopefully this makes sense.
>>
>> Justin
>>
> 



