[PATCH] iommu/arm-smmu-qcom: NULL pointer check for driver data
Aravind Vijayakumar
quic_aprasann at quicinc.com
Thu Sep 7 22:17:52 PDT 2023
On 8/29/2023 7:30 AM, Rob Clark wrote:
> On Mon, Aug 28, 2023 at 2:35 PM Aravind Vijayakumar
> <quic_aprasann at quicinc.com> wrote:
>>
>> On 8/16/2023 6:01 PM, Rob Clark wrote:
>>> On Wed, Aug 16, 2023 at 3:55 PM Aravind Vijayakumar
>>> <quic_aprasann at quicinc.com> wrote:
>>>> The driver_data is NULL when qcom_adreno_smmu_init_context()
>>>> is called before the dev_set_drvdata() from the client driver
>>>> and is resulting in kernel crash.
>>>>
>>>> So add a null pointer check to handle the scenario
>>>> where the client driver for the GPU SMMU device would
>>>> be setting the driver data after the smmu client device
>>>> probe is done and not necessarily before that. The function
>>>> qcom_adreno_smmu_init_context() assumes that the client
>>>> driver always set the driver data using dev_set_drvdata()
>>>> before the smmu client device probe, but this assumption
>>>> is not always true.
>>>>
>>>> Signed-off-by: Aravind Vijayakumar <quic_aprasann at quicinc.com>
>>>> ---
>>>> drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 3 +++
>>>> 1 file changed, 3 insertions(+)
>>>>
>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>> index c71afda79d64..5323f82264ca 100644
>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>> @@ -231,6 +231,9 @@ static int qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>>>> */
>>>>
>>>> priv = dev_get_drvdata(dev);
>>>> + if (!priv)
>>>> + return 0;
>>> could this -EPROBE_DEFER instead, or something like that? I think you
>>> patch as proposed would result in per-process gpu pgtables silently
>>> failing
>>>
>>> BR,
>>> -R
>> Thanks for the review comments. Returning -EPROBE_DEFER wont work
>> because the probe of the client driver (which sets the driver data) will
>> never get triggered. However, the probe of the client driver succeeds if
>> we return -ENODATA. would that be acceptable?
> I _think_ so.. I need to page back in the sequence of how this works,
> but I do have some warn_on's in drm/msm to complain loudly if we don't
> get per-process pgtables. I'd be interested to see the callstack
> where you hit this issue. From what I remember the sequence should
> be:
>
> 1) before the client dev probes, arm-smmu probes and attaches the
> dma-api managed iommu_domain (which IIRC should be an identity domain,
> and is otherwise unused).. at this point drvdata is NULL
> 2) the drm/msm can probe
> 3) at some point later when GPU fw is avail the GPU is loaded, drvdata
> is set, and we start creating and attaching the iommu_domain's that
> are actually used (one for kernel context and one each for userspace
> processes using the GPU
>
> I guess maybe if you are hitting this case of NULL drvdata, then you
> aren't getting an identity context for the dma-api managed
> iommu_domain?
>
> BR,
> -R
>
Yes, there are some warn_ons in io-pgtable.c, which have helped a lot
during debugging. The following is the call stack when we are hitting
the issue:
qcom_adreno_smmu_init_context+0x28/0x100
arm_smmu_init_domain_context+0x1fc/0x4cc
arm_smmu_attach_dev+0x7c/0x410
__iommu_attach_device+0x28/0x110
iommu_probe_device+0x98/0x144
of_iommu_configure+0x1f0/0x278
of_dma_configure_id+0x15c/0x320
platform_dma_configure+0x24/0x90
really_probe+0x138/0x39c
__driver_probe_device+0x114/0x190
device_driver_attach+0x4c/0xac
bind_store+0xb8/0x110
This is the call stack during platform_driver_register() , if there is
no NULL check then the initial probe crashes, if there is NULL check,
instead of crashing, the really_probe returns and we can call
of_dma_configure again from the driver probe after setting the driver
data. Please let me know if there is any concerns?
Regards,
Aravind
>> Regards,
>>
>> Aravind
>>
>>>> +
>>>> priv->cookie = smmu_domain;
>>>> priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
>>>> priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
>>>> --
>>>> 2.40.1
>>>>
More information about the linux-arm-kernel
mailing list