[PATCH v3 13/13] coresight: Fix CTI module refcount leak by making it a helper device
Suzuki K Poulose
suzuki.poulose at arm.com
Tue Apr 4 06:59:53 PDT 2023
On 04/04/2023 14:04, James Clark wrote:
>
>
> On 04/04/2023 13:55, James Clark wrote:
>>
>>
>> On 04/04/2023 10:21, Suzuki K Poulose wrote:
>>> On 29/03/2023 12:53, James Clark wrote:
>>>> The CTI module has some hard coded refcounting code that has a leak.
>>>> For example running perf and then trying to unload it fails:
>>>>
>>>> perf record -e cs_etm// -a -- ls
>>>> rmmod coresight_cti
>>>>
>>>> rmmod: ERROR: Module coresight_cti is in use
>>>>
>>>> The coresight core already handles references of devices in use, so by
>>>> making CTI a normal helper device, we get working refcounting for free.
>>>>
>>>> Signed-off-by: James Clark <james.clark at arm.com>
>>>> ---
>>>> drivers/hwtracing/coresight/coresight-core.c | 99 ++++++-------------
>>>> .../hwtracing/coresight/coresight-cti-core.c | 52 +++++-----
>>>> .../hwtracing/coresight/coresight-cti-sysfs.c | 4 +-
>>>> drivers/hwtracing/coresight/coresight-cti.h | 4 +-
>>>> drivers/hwtracing/coresight/coresight-priv.h | 4 +-
>>>> drivers/hwtracing/coresight/coresight-sysfs.c | 4 +
>>>> include/linux/coresight.h | 30 +-----
>>>> 7 files changed, 70 insertions(+), 127 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c
>>>> b/drivers/hwtracing/coresight/coresight-core.c
>>>> index 65f5bd8516d8..458d91b4e23f 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>>> @@ -254,60 +254,39 @@ void coresight_disclaim_device(struct
>>>> coresight_device *csdev)
>>>> }
>>>> EXPORT_SYMBOL_GPL(coresight_disclaim_device);
>>>> -/* enable or disable an associated CTI device of the supplied CS
>>>> device */
>>>> -static int
>>>> -coresight_control_assoc_ectdev(struct coresight_device *csdev, bool
>>>> enable)
>>>> -{
>>>> - int ect_ret = 0;
>>>> - struct coresight_device *ect_csdev = csdev->ect_dev;
>>>> - struct module *mod;
>>>> -
>>>> - if (!ect_csdev)
>>>> - return 0;
>>>> - if ((!ect_ops(ect_csdev)->enable) || (!ect_ops(ect_csdev)->disable))
>>>> - return 0;
>>>> -
>>>> - mod = ect_csdev->dev.parent->driver->owner;
>>>> - if (enable) {
>>>> - if (try_module_get(mod)) {
>>>> - ect_ret = ect_ops(ect_csdev)->enable(ect_csdev);
>>>> - if (ect_ret) {
>>>> - module_put(mod);
>>>> - } else {
>>>> - get_device(ect_csdev->dev.parent);
>>>> - csdev->ect_enabled = true;
>>>> - }
>>>> - } else
>>>> - ect_ret = -ENODEV;
>>>> - } else {
>>>> - if (csdev->ect_enabled) {
>>>> - ect_ret = ect_ops(ect_csdev)->disable(ect_csdev);
>>>> - put_device(ect_csdev->dev.parent);
>>>> - module_put(mod);
>>>> - csdev->ect_enabled = false;
>>>> - }
>>>> - }
>>>> -
>>>> - /* output warning if ECT enable is preventing trace operation */
>>>> - if (ect_ret)
>>>> - dev_info(&csdev->dev, "Associated ECT device (%s) %s failed\n",
>>>> - dev_name(&ect_csdev->dev),
>>>> - enable ? "enable" : "disable");
>>>> - return ect_ret;
>>>> -}
>>>> -
>>>> /*
>>>> - * Set the associated ect / cti device while holding the coresight_mutex
>>>> + * Add a helper as an output device while holding the coresight_mutex
>>>> * to avoid a race with coresight_enable that may try to use this
>>>> value.
>>>> */
>>>> -void coresight_set_assoc_ectdev_mutex(struct coresight_device *csdev,
>>>> - struct coresight_device *ect_csdev)
>>>> +void coresight_add_helper_mutex(struct coresight_device *csdev,
>>>> + struct coresight_device *helper)
>>>
>>> minor nit: It may be a good idea to rename this, in line with the
>>> kernel naming convention :
>>>
>>> coresight_add_helper_unlocked()
>>>
>>> Or if this is the only variant, it is OK to leave it as :
>>> coresight_add_helper()
>>> with a big fat comment in the function description to indicate
>>> that it takes the mutex and may be even add a :
>>>
>> There is already a bit of a comment in the description but I can expand
>> on it more.
>>
>>> might_sleep() and lockdep_assert_not_held(&coresight_mutex);
>>>
>>> in the function.
>>>
>>
>> I'm not sure if lockdep_assert_not_held() would be right because
>> sometimes it could be held if another device is being created at the
>> same time? Or something like a session is started at the same time a CTI
>> device is added.
>>
>
> Oh I see it's not for any task, it's just for the current one. That
> makes sense then I can add it.
>
> Although it looks like it only warns when lockdep is enabled, but don't
> you get a warning anyway if you try to take the lock twice with lockdep
> enabled?
That's true, you could ignore the lockdep check.
> So I'm not sure why we would add lockdep_assert_not_held() here
> and not on all the mutex_lock() calls?
Ah. I double checked this and the coresight_mutex is static and local to
coresight-core.c. So there is no point in talking about locking for
external users. So I would just leave out any suffixes and simply use
the lockdep check implicit from mutex_lock().
Suzuki
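
For readers following the thread, here is a minimal userspace sketch of the point settled above: when the lock is private to one file and the function takes it itself, a relock by the same task is reported by the lock's own checking, so no extra assertion or name suffix is needed. This is an analogy only, not kernel code. A pthread error-checking mutex stands in for mutex_lock() with lockdep enabled, and every name in it (model_device, model_mutex, model_add_helper, model_add_helper_while_locked) is hypothetical and does not appear in the patch.

```c
/* Userspace analogy only: a pthread error-checking mutex stands in for
 * the kernel's mutex_lock() with lockdep. All names are hypothetical. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_HELPERS 4

struct model_device {
	const char *name;
	struct model_device *helpers[MAX_HELPERS];
	size_t nr_helpers;
};

/* File-local lock, like the static coresight_mutex in coresight-core.c. */
static pthread_mutex_t model_mutex;
static pthread_once_t model_mutex_once = PTHREAD_ONCE_INIT;

static void model_mutex_init(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* ERRORCHECK makes a relock by the owning thread fail with EDEADLK. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&model_mutex, &attr);
	pthread_mutexattr_destroy(&attr);
}

/* Takes model_mutex itself; returns 0 or a positive errno value. */
int model_add_helper(struct model_device *dev, struct model_device *helper)
{
	int ret;

	assert(dev && helper);
	pthread_once(&model_mutex_once, model_mutex_init);

	ret = pthread_mutex_lock(&model_mutex);
	if (ret)
		return ret;	/* EDEADLK if this thread already holds it */

	if (dev->nr_helpers < MAX_HELPERS)
		dev->helpers[dev->nr_helpers++] = helper;
	else
		ret = ENOSPC;

	pthread_mutex_unlock(&model_mutex);
	return ret;
}

/* Models a buggy in-file caller that already holds the lock. */
int model_add_helper_while_locked(struct model_device *dev,
				  struct model_device *helper)
{
	int ret;

	pthread_once(&model_mutex_once, model_mutex_init);
	pthread_mutex_lock(&model_mutex);
	ret = model_add_helper(dev, helper);	/* relock attempt is reported */
	pthread_mutex_unlock(&model_mutex);
	return ret;
}
```

Calling model_add_helper() on its own succeeds and records the helper, while calling it through model_add_helper_while_locked() returns EDEADLK and leaves the helper list unchanged. In the kernel the equivalent report would come from lockdep at the mutex_lock() call, assuming lockdep is enabled in the build.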