[PATCH 2/2] mailbox: mtk-cmdq: Move pm_runimte_get and put to mbox_chan_ops API

Jassi Brar jassisinghbrar at gmail.com
Mon Jun 17 11:18:17 PDT 2024


On Thu, Jun 13, 2024 at 11:01 PM Jason-JH.Lin <jason-jh.lin at mediatek.com> wrote:
>
> When we run the kernel with lock debugging enabled, we get the BUG below:
>   BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1164
>   in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 3616, name: kworker/u17:3
>     preempt_count: 1, expected: 0
>     RCU nest depth: 0, expected: 0
>     INFO: lockdep is turned off.
>     irq event stamp: 0
>     CPU: 1 PID: 3616 Comm: kworker/u17:3 Not tainted 6.1.87-lockdep-14133-g26e933aca785 #1
>     Hardware name: Google Ciri sku0/unprovisioned board (DT)
>     Workqueue: imgsys_runner imgsys_runner_func
>     Call trace:
>      dump_backtrace+0x100/0x120
>      show_stack+0x20/0x2c
>      dump_stack_lvl+0x84/0xb4
>      dump_stack+0x18/0x48
>      __might_resched+0x354/0x4c0
>      __might_sleep+0x98/0xe4
>      __pm_runtime_resume+0x70/0x124
>      cmdq_mbox_send_data+0xe4/0xb1c
>      msg_submit+0x194/0x2dc
>      mbox_send_message+0x190/0x330
>      imgsys_cmdq_sendtask+0x1618/0x2224
>      imgsys_runner_func+0xac/0x11c
>      process_one_work+0x638/0xf84
>      worker_thread+0x808/0xcd0
>      kthread+0x24c/0x324
>      ret_from_fork+0x10/0x20
>
> msg_submit() in mailbox.c calls the driver's send_data() callback while
> holding a spinlock taken with spin_lock_irqsave(), so the callback runs
> in atomic context. When the cmdq driver calls pm_runtime_get_sync(),
> which may sleep, from cmdq_mbox_send_data(), it triggers this BUG report.
>
> To avoid sleeping in atomic context, move pm_runtime_get_sync() to
> mbox_chan_ops->power_get and move pm_runtime_put_autosuspend() to
> mbox_chan_ops->power_put.
>
> Fixes: 8afe816b0c99 ("mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend")
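
For readers following the thread, the ordering problem in the quoted
description can be modelled in userspace. This is only a sketch under
stated assumptions: every function and variable below is a hypothetical
stand-in (none of them are kernel APIs), the counter merely imitates the
"sleeping function called from invalid context" check, and the point at
which a power_get-style hook would actually be invoked is assumed rather
than taken from the patch.

```c
#include <assert.h>

/* Userspace model only; these names are illustrative, not kernel code. */
static int atomic_depth;     /* stands in for the kernel's preempt count */
static int sleep_violations; /* times a sleeping call ran while atomic   */
static int pm_usage;         /* stands in for a runtime-PM usage counter */

static void model_lock(void)
{
    atomic_depth++;          /* like taking a spinlock: now atomic */
}

static void model_unlock(void)
{
    assert(atomic_depth > 0);
    atomic_depth--;
}

/* Models a call that may sleep, in the way pm_runtime_get_sync() may. */
static void model_pm_get_sync(void)
{
    if (atomic_depth > 0)
        sleep_violations++;  /* would be the BUG splat in the kernel */
    pm_usage++;
}

static void model_pm_put(void)
{
    pm_usage--;
}

/* Old ordering: the sleeping PM call sits inside the locked send path. */
static int submit_old_ordering(void)
{
    int before = sleep_violations;

    model_lock();
    model_pm_get_sync();     /* sleeping call while atomic */
    /* ... program the hardware ... */
    model_pm_put();
    model_unlock();
    return sleep_violations - before;
}

/* New ordering: power is taken before the lock and dropped after it. */
static int submit_new_ordering(void)
{
    int before = sleep_violations;

    model_pm_get_sync();     /* assumed power_get-style hook, not atomic */
    model_lock();
    /* ... program the hardware ... */
    model_unlock();
    model_pm_put();          /* assumed power_put-style hook */
    return sleep_violations - before;
}
```

In this model, submit_old_ordering() records one violation per call and
submit_new_ordering() records none, with the usage counter balanced in
both cases. It says nothing about the cost of keeping the device powered
between transfers, which is what the questions below are about.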

Can you please share the pattern of mailbox transfers on your platform?
That is, how often do "channel idle" periods occur, and how long do they
last? And when the channel is active, how many transfers/sec are there?
I see every TX is acked by one RX packet. How long is the typical gap
between a TX and its ack?

Thanks


