[PATCH] coresight: Fix possible deadlock with lock dependency

Mike Leach mike.leach at linaro.org
Mon Aug 15 07:48:20 PDT 2022


Hi Sudeep,

On Thu, 21 Jul 2022 at 14:03, Sudeep Holla <sudeep.holla at arm.com> wrote:
>
> With lockdeps enabled, we get the following warning:
>
> ======================================================
> WARNING: possible circular locking dependency detected
> ------------------------------------------------------
> kworker/u12:1/53 is trying to acquire lock:
> ffff80000adce220 (coresight_mutex){+.+.}-{4:4}, at: coresight_set_assoc_ectdev_mutex+0x3c/0x5c
> but task is already holding lock:
> ffff80000add1f60 (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
>
> which lock already depends on the new lock.
> the existing dependency chain (in reverse order) is:
>
> -> #1 (ect_mutex){+.+.}-{4:4}:
>        __mutex_lock_common+0xd8/0xe60
>        mutex_lock_nested+0x44/0x50
>        cti_add_assoc_to_csdev+0x4c/0x184
>        coresight_register+0x2f0/0x314
>        tmc_probe+0x33c/0x414
>
> -> #0 (coresight_mutex){+.+.}-{4:4}:
>        __lock_acquire+0x1a20/0x32d0
>        lock_acquire+0x160/0x308
>        __mutex_lock_common+0xd8/0xe60
>        mutex_lock_nested+0x44/0x50
>        coresight_set_assoc_ectdev_mutex+0x3c/0x5c
>        cti_update_conn_xrefs+0x6c/0xf8
>        cti_probe+0x33c/0x394
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>        CPU0                    CPU1
>        ----                    ----
>   lock(ect_mutex);
>                                lock(coresight_mutex);
>                                lock(ect_mutex);
>   lock(coresight_mutex);
>  *** DEADLOCK ***
>
> 4 locks held by kworker/u12:1/53:
>  #0: ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x1fc/0x63c
>  #1: (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x228/0x63c
>  #2: (&dev->mutex){....}-{4:4}, at: __device_attach+0x48/0x1a8
>  #3: (ect_mutex){+.+.}-{4:4}, at: cti_probe+0x318/0x394
>
> To fix the same, call cti_add_assoc_to_csdev without the holding
> coresight_mutex and confine the locking while setting the associated
> ect / cti device using coresight_set_assoc_ectdev_mutex().
>
> Cc: Mathieu Poirier <mathieu.poirier at linaro.org>
> Cc: Suzuki K Poulose <suzuki.poulose at arm.com>
> Cc: Mike Leach <mike.leach at linaro.org>
> Cc: Leo Yan <leo.yan at linaro.org>
> Signed-off-by: Sudeep Holla <sudeep.holla at arm.com>
> ---
>  drivers/hwtracing/coresight/coresight-core.c     | 7 ++++---
>  drivers/hwtracing/coresight/coresight-cti-core.c | 5 +++--
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
> Hi,
>
> I am not sure if I missed to consider something, but this is something
> I think could be the fix.
>
> Regards,
> Sudeep
>

I have been running with lockdep testing coresight for a number of
weeks now, without encountering the same issue with CTIs.
That said, I am on a DB410, and the code in question is designed to
take different paths depending on whether the CTI or other device is
discovered first, so this could well be something different in the
discovery order between juno and DB410.

The change looks perfectly fine to me.

As such:
Reviewed-by: Mike Leach <mike.leach at linaro.org>

> diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
> index 1edfec1e9d18..275b4dce8401 100644
> --- a/drivers/hwtracing/coresight/coresight-core.c
> +++ b/drivers/hwtracing/coresight/coresight-core.c
> @@ -1659,14 +1659,15 @@ struct coresight_device *coresight_register(struct coresight_desc *desc)
>                 ret = coresight_fixup_device_conns(csdev);
>         if (!ret)
>                 ret = coresight_fixup_orphan_conns(csdev);
> -       if (!ret && cti_assoc_ops && cti_assoc_ops->add)
> -               cti_assoc_ops->add(csdev);
>
>  out_unlock:
>         mutex_unlock(&coresight_mutex);
>         /* Success */
> -       if (!ret)
> +       if (!ret) {
> +               if (cti_assoc_ops && cti_assoc_ops->add)
> +                       cti_assoc_ops->add(csdev);
>                 return csdev;
> +       }
>
>         /* Unregister the device if needed */
>         if (registered) {
> diff --git a/drivers/hwtracing/coresight/coresight-cti-core.c b/drivers/hwtracing/coresight/coresight-cti-core.c
> index 8988b2ed2ea6..1be92342b5b9 100644
> --- a/drivers/hwtracing/coresight/coresight-cti-core.c
> +++ b/drivers/hwtracing/coresight/coresight-cti-core.c
> @@ -541,7 +541,7 @@ cti_match_fixup_csdev(struct cti_device *ctidev, const char *node_name,
>  /*
>   * Search the cti list to add an associated CTI into the supplied CS device
>   * This will set the association if CTI declared before the CS device.
> - * (called from coresight_register() with coresight_mutex locked).
> + * (called from coresight_register() without coresight_mutex locked).
>   */
>  static void cti_add_assoc_to_csdev(struct coresight_device *csdev)
>  {
> @@ -569,7 +569,8 @@ static void cti_add_assoc_to_csdev(struct coresight_device *csdev)
>                          * if we found a matching csdev then update the ECT
>                          * association pointer for the device with this CTI.
>                          */
> -                       csdev->ect_dev = ect_item->csdev;
> +                       coresight_set_assoc_ectdev_mutex(csdev->ect_dev,
> +                                                        ect_item->csdev);
>                         break;
>                 }
>         }
> --
> 2.37.1
>


-- 
Mike Leach
Principal Engineer, ARM Ltd.
Manchester Design Centre. UK



More information about the linux-arm-kernel mailing list