[PATCH] firmware: arm_scmi: Queue in scmi layer for mailbox implementation
Sudeep Holla
sudeep.holla at arm.com
Tue Oct 8 06:17:00 PDT 2024
On Mon, Oct 07, 2024 at 10:58:47AM -0700, Justin Chen wrote:
>
>
> On 10/7/24 6:10 AM, Cristian Marussi wrote:
> > On Mon, Oct 07, 2024 at 02:04:10PM +0100, Sudeep Holla wrote:
> > > On Fri, Oct 04, 2024 at 03:12:57PM -0700, Justin Chen wrote:
> > > > The mailbox layer has its own queue. However this confuses the per
> > > > message timeouts since the clock starts ticking the moment the messages
> > > > get queued up. So all messages in the queue have there timeout clocks
> > > > ticking instead of only the message inflight. To fix this, lets move the
> > > > queue back into the SCMI layer.
> > > >
> > >
> > > I think this has come up in the past. We have avoided adding addition
> > > locking here as the mailbox layer takes care of it. Has anything changed
> > > recently ?
> >
> > I asked for an explanation in my reply (we crossed each other mails probably)
> > since it alredy came up in the past a few times and central locking seemed not
> > to be needed...here the difference is about the reason...Justin talks about
> > message timeouts related to the queueing process..so I asked to better
> > explain the detail (and the anbomaly observed) since it still does not
> > seem to me that even in this case the lock is needed....anyway I can
> > definitely be woring of course :D
> >
>
> Hello Cristian,
>
> Thanks for the response. I'll try to elaborate.
>
> When comparing SMC and mailbox transport, we noticed mailbox transport
> timesout much quicker when under load. Originally we thought this was the
> latency of the mailbox implementation, but after debugging we noticed a
> weird behavior. We saw SMCI transactions timing out before the mailbox even
> transmitted the message.
>
> This issue lies in the SCMI layer. drivers/firmware/arm_scmi/driver.c
> do_xfer() function.
>
> The fundamental issue is send_message() blocks for SMC transport, but
> doesn't block for mailbox transport. So if send_message() doesn't block we
> can have multiple messages waiting at scmi_wait_for_message_response().
>
> SMC looks like this
> CPU #0 SCMI message 0 -> calls send_message() then calls
> scmi_wait_for_message_response(), timesout after 30ms.
> CPU #1 SCMI message 1 -> blocks at send_message() waiting for SCMI message 0
> to complete.
>
> Mailbox looks like this
> CPU #0 SCMI message 0 -> calls send_message(), mailbox layer queues up
> message, mailbox layer sees no message is outgoing and sends it. CPU waits
> at scmi_wait_for_message_response(), timesout after 30ms
> CPU #1 SCMI message 1 -> calls send_message(), mailbox layer queues up
> message, mailbox layer sees message pending, hold message in queue. CPU
> waits at scmi_wait_for_message_response(), timesout after 30ms.
>
> Lets say if transport takes 25ms. The first message would succeed, the
> second message would timeout after 5ms.
>
I understand this issue and was aware of this. Just had assumed it won't
have such a impact as I was assuming transport to take couple of ms for
any synchronous commands.
Typically I would expect transport to take much less that 25ms(1-4ms IMO)
to complete any synchronous commands. If it takes 25ms for sync command,
then it could be some serious issue.
Anyways I am not against the idea. More details in response to Cristian.
--
Regards,
Sudeep
More information about the linux-arm-kernel
mailing list