Help: i.MX8MP AUDIOMIX BLK-CTRL CLK driver support
Peng Fan
peng.fan at oss.nxp.com
Tue Sep 27 02:36:07 PDT 2022
Hi Lucas,
On 9/23/2022 4:18 PM, Lucas Stach wrote:
> Hi Peng,
>
> Am Freitag, dem 23.09.2022 um 11:41 +0800 schrieb Peng Fan:
>> Hi All,
>>
>> I would start a discussion about the A/B B/A lock issue when make
>> audiomix blk ctrl function as clk provider.
>>
>> I not have good idea on this, hope you have any suggestions.
>>
>> major issue is: the blk ctrl clk has a power domain supplier
>> The power domain supplier also use clk API to prepare_enable clks.
>> The blk ctrl clk driver has runtime pm enabled.
>>
>> The NXP downstream:
>> The dts:
>> https://github.com/nxp-imx/linux-imx/blob/lf-5.15.y/arch/arm64/boot/dts/freescale/imx8mp.dtsi#L1872
>>
>> The driver:
>> https://github.com/nxp-imx/linux-imx/blob/lf-5.15.y/drivers/clk/imx/clk-imx8mp.c
>> https://github.com/nxp-imx/linux-imx/blob/lf-5.15.y/drivers/clk/imx/clk-blk-ctrl.c
>>
>> Note: The following log was reproduced with NXP downstream. For upstream
>> I think we have similar issue if we still take audiomix blk ctrl as clk
>> driver. Because the gpcv2 also use clk prepare enable API.
>> 1. deadlock 1:
>> Callchain after enable some lock debug:
>> clk_ignore_unused will use lock seq: take prepare lock, take blk-ctrl
>> parent power domain genpd lock
>> genpd_power_off_work_fun will use lock seq: take the power domain genpd
>> lock, take prepare lock.
>>
>> clk_disable_unused:
>> [ 11.667711][ T108] -> #1 (&genpd->mlock){+.+.}-{3:3}:
>> [ 11.675041][ T108] __lock_acquire+0xae4/0xef8
>> [ 11.680093][ T108] lock_acquire+0xfc/0x2f8
>> [ 11.684888][ T108] __mutex_lock+0x90/0x870
>> [ 11.689685][ T108] mutex_lock_nested+0x44/0x50
>> [ 11.694826][ T108] genpd_lock_mtx+0x18/0x24
>> [ 11.699706][ T108] genpd_runtime_resume+0x90/0x214 (hold
>> genpd->mlock)
>> [ 11.705194][ T108] __rpm_callback+0x80/0x2c0
>> [ 11.710160][ T108] rpm_resume+0x468/0x650
>> [ 11.714866][ T108] __pm_runtime_resume+0x60/0x88
>> [ 11.720180][ T108] clk_pm_runtime_get+0x28/0x9c
>> [ 11.725410][ T108] clk_disable_unused_subtree+0x8c/0x144
>> [ 11.731420][ T108] clk_disable_unused_subtree+0x124/0x144
>> [ 11.737518][ T108] clk_disable_unused+0xa4/0x11c (hold
>> prepare_lock)
>> [ 11.742833][ T108] do_one_initcall+0x98/0x178
>> [ 11.747888][ T108] do_initcall_level+0x9c/0xb8
>> [ 11.753028][ T108] do_initcalls+0x54/0x94
>> [ 11.757736][ T108] do_basic_setup+0x24/0x30
>> [ 11.762614][ T108] kernel_init_freeable+0x70/0xa4
>> [ 11.768014][ T108] kernel_init+0x14/0x18c
>> [ 11.772722][ T108] ret_from_fork+0x10/0x18
>> [ 11.777512][ T108] -> #0 (prepare_lock){+.+.}-{3:3}:
>> [ 11.784749][ T108] check_noncircular+0x134/0x13c
>> [ 11.790064][ T108] validate_chain+0x590/0x2a04
>> [ 11.795204][ T108] __lock_acquire+0xae4/0xef8
>> [ 11.800258][ T108] lock_acquire+0xfc/0x2f8
>> [ 11.805050][ T108] __mutex_lock+0x90/0x870
>> [ 11.809841][ T108] mutex_lock_nested+0x44/0x50
>> [ 11.814983][ T108] clk_unprepare+0x5c/0x100 ((hold prepare_lock))
>> [ 11.819864][ T108] imx8m_pd_power_off+0xac/0x110
>> [ 11.825179][ T108] genpd_power_off+0x1b4/0x2dc
>> [ 11.830318][ T108] genpd_power_off_work_fn+0x38/0x58 (hold
>> genpd->mlock)
>> [ 11.835981][ T108] process_one_work+0x270/0x444
>> [ 11.841208][ T108] worker_thread+0x280/0x4e4
>> [ 11.846176][ T108] kthread+0x13c/0x14c
>> [ 11.850621][ T108] ret_from_fork+0x10/0x18
>>
>>
>> 2:
>> another mutex dead lock between deferred_probe_work_func/
>> genpd_power_off_work_fn
>>
>> The sequency in deferred_probe_work_func is lock the clk firstly, than
>> hold pd secondly.
>>
>>
>> [ 11.351159][ T46] the existing dependency chain (in reverse order) is:
>> [ 11.351162][ T46]
>> [ 11.351162][ T46] -> #1 (&genpd->mlock){+.+.}-{3:3}:
>> [ 11.351176][ T46] __lock_acquire+0xae4/0xef8
>> [ 11.351180][ T46] lock_acquire+0xfc/0x2f8
>> [ 11.351185][ T46] __mutex_lock+0x90/0x870
>> [ 11.351188][ T46] mutex_lock_nested+0x44/0x50
>> [ 11.351192][ T46] genpd_lock_mtx+0x18/0x24
>> [ 11.351196][ T46] genpd_runtime_suspend+0x1dc/0x224 (hold pd
>> lock)
>> [ 11.351201][ T46] __rpm_callback+0x80/0x2c0
>> [ 11.351205][ T46] rpm_suspend+0x2a4/0x62c
>> [ 11.351208][ T46] rpm_idle+0x158/0x228
>> [ 11.351212][ T46] __pm_runtime_idle+0x64/0xac
>> [ 11.351217][ T46] clk_change_rate+0x400/0x494
>> (clk_pm_runtime_put()
>> [ 11.351220][ T46] clk_change_rate+0x438/0x494
>> [ 11.351224][ T46] clk_core_set_rate_nolock+0x22c/0x2d8
>> [ 11.351228][ T46] clk_set_rate+0x94/0x134 (hold prepared lock)
>> [ 11.351232][ T46] of_clk_set_defaults+0x27c/0x364
>> [ 11.351236][ T46] platform_drv_probe+0x28/0xbc
>> [ 11.351241][ T46] really_probe+0x1dc/0x4c0
>> [ 11.351245][ T46] driver_probe_device+0x7c/0xb8
>> [ 11.351249][ T46] __device_attach_driver+0x118/0x13c
>> [ 11.351253][ T46] bus_for_each_drv+0x80/0xcc
>> [ 11.351257][ T46] __device_attach+0xd0/0x174
>> [ 11.351261][ T46] device_initial_probe+0x14/0x20
>> [ 11.351265][ T46] bus_probe_device+0x34/0x9c
>> [ 11.351269][ T46] deferred_probe_work_func+0x64/0xc4
>> [ 11.351274][ T46] process_one_work+0x270/0x444
>> [ 11.351278][ T46] worker_thread+0x280/0x4e4
>> [ 11.351288][ T46] kthread+0x13c/0x14c
>> [ 11.529876][ T46] ret_from_fork+0x10/0x18
>>
>
> There are two possible solutions here:
>
> 1. Make it a contract between the blk-ctrl and the devices put into
> blk-ctrl power domains that they first need to RPM resume then call the
> clock functions, never call clock clock functions when the device is
> suspended. This way the GPC domain is already up when any clock
> manipulation is done, so the GPC will not need to take any clock
> mutexes. Downside to this is that drivers then need to adhere to this
> contract when they are in a blk-ctrl domain, which deviates from the
> usual rules, so might take some driver writers by surprise if they
> aren't aware of this.
>
> 2. Move the bus/reset clocks from the GPC domain to the blk-ctrl. In
> the current design blk-ctrl has full control over the sequencing of the
> GPC power up, so it would be possible to enable the bus/reset clocks
> from the blk-ctrl driver, then power up the GPC domain. As the task
> doing the the bus/reset clk enable would then be the same that already
> has the clk_prepare mutex there will be no deadlock, as the clk
> framework alows this kind of recursive locking.
Thanks for sharing thoughts.
option 2 seems good to me, option 1 requires more changes in consumer
drivers side.
Marek,
Do you also have time to give a look?
Thanks,
Peng.
>
> Regards,
> Lucas
>
More information about the linux-arm-kernel
mailing list