[PATCH] rpmsg: mtk_rpmsg: Fix circular locking dependency

AngeloGioacchino Del Regno angelogioacchino.delregno at collabora.com
Fri Apr 29 06:37:30 PDT 2022


Il 28/04/22 19:31, Mathieu Poirier ha scritto:
> On Fri, Jan 14, 2022 at 03:47:37PM +0100, AngeloGioacchino Del Regno wrote:
>> During execution of the worker that's used to register rpmsg devices
>> we are safely locking the channels mutex but, when creating a new
>> endpoint for such devices, we are registering a IPI on the SCP, which
>> then makes the SCP to trigger an interrupt, lock its own mutex and in
>> turn register more subdevices.
>> This creates a circular locking dependency situation, as the mtk_rpmsg
>> channels_lock will then depend on the SCP IPI lock.
>>
>> [   18.014514]  Possible unsafe locking scenario:
>> [   18.014515]        CPU0                    CPU1
>> [   18.014517]        ----                    ----
>> [   18.045467]   lock(&mtk_subdev->channels_lock);
>> [   18.045474]                                lock(&scp->ipi_desc[i].lock);
>> [   18.228399]                                lock(&mtk_subdev->channels_lock);
>> [   18.228405]   lock(&scp->ipi_desc[i].lock);
>> [   18.264405]
> 
> I finally understand the problem, something that would have been impossible
> without the pastebin you provided in your latest email.  Please add the content
> of that pastebin to the changelog and send another revision (checkpatch
> warnings can be ignored).
> 

Thanks for giving it another look... I'll add a cover letter with the contents
of the pastebin to avoid possible confusion for anyone looking at the patch.

>>
>> To solve this, simply unlock the channels_lock mutex before calling
>> mtk_rpmsg_register_device() and relock it right after, as safety is
>> still ensured by the locking mechanism that happens right after
>> through SCP.
> 
> The integrity of the subdev->channels list is guaranteed by relocking the
> mutex, I'm not sure what "through SCP" adds to the sentence.

I'll clarify the commit message.

> 
>> Notably, mtk_rpmsg_register_device() does not even require locking.
>>
> 
> I don't agree with the above sentence - if locking doesn't happen in
> mtk_rpmsg_create_device(), there can be two CPUs accessing the list at the same
> time.
> 

That's right, I have largely underestimated that for some reason, sorry about that.

Regards,
Angelo

> Thanks,
> Mathieu
> 



More information about the linux-arm-kernel mailing list