[PATCH] iommu/arm-smmu-qcom: NULL pointer check for driver data

Aravind Vijayakumar quic_aprasann at quicinc.com
Thu Sep 14 17:20:52 PDT 2023


On 9/8/2023 5:21 AM, Robin Murphy wrote:
> On 2023-09-08 06:17, Aravind Vijayakumar wrote:
>>
>> On 8/29/2023 7:30 AM, Rob Clark wrote:
>>> On Mon, Aug 28, 2023 at 2:35 PM Aravind Vijayakumar
>>> <quic_aprasann at quicinc.com> wrote:
>>>>
>>>> On 8/16/2023 6:01 PM, Rob Clark wrote:
>>>>> On Wed, Aug 16, 2023 at 3:55 PM Aravind Vijayakumar
>>>>> <quic_aprasann at quicinc.com> wrote:
>>>>>> The driver_data is NULL when qcom_adreno_smmu_init_context()
>>>>>> is called before the dev_set_drvdata() from the client driver
>>>>>> and is resulting in kernel crash.
>>>>>>
>>>>>> So add a null pointer check to handle the scenario
>>>>>> where the client driver for the GPU SMMU device would
>>>>>> be setting the driver data after the smmu client device
>>>>>> probe is done and not necessarily before that. The function
>>>>>> qcom_adreno_smmu_init_context() assumes that the client
>>>>>> driver always set the driver data using dev_set_drvdata()
>>>>>> before the smmu client device probe, but this assumption
>>>>>> is not always true.
>>>>>>
>>>>>> Signed-off-by: Aravind Vijayakumar <quic_aprasann at quicinc.com>
>>>>>> ---
>>>>>>    drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c | 3 +++
>>>>>>    1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c 
>>>>>> b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>> index c71afda79d64..5323f82264ca 100644
>>>>>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c
>>>>>> @@ -231,6 +231,9 @@ static int 
>>>>>> qcom_adreno_smmu_init_context(struct arm_smmu_domain *smmu_domain,
>>>>>>            */
>>>>>>
>>>>>>           priv = dev_get_drvdata(dev);
>>>>>> +       if (!priv)
>>>>>> +               return 0;
>>>>> could this -EPROBE_DEFER instead, or something like that? I think you
>>>>> patch as proposed would result in per-process gpu pgtables silently
>>>>> failing
>>>>>
>>>>> BR,
>>>>> -R
>>>> Thanks for the review comments. Returning -EPROBE_DEFER wont work
>>>> because the probe of the client driver (which sets the driver data) 
>>>> will
>>>> never get triggered. However, the probe of the client driver 
>>>> succeeds if
>>>> we return -ENODATA. would that be acceptable?
>>> I _think_ so.. I need to page back in the sequence of how this works,
>>> but I do have some warn_on's in drm/msm to complain loudly if we don't
>>> get per-process pgtables.  I'd be interested to see the callstack
>>> where you hit this issue.  From what I remember the sequence should
>>> be:
>>>
>>> 1) before the client dev probes, arm-smmu probes and attaches the
>>> dma-api managed iommu_domain (which IIRC should be an identity domain,
>>> and is otherwise unused).. at this point drvdata is NULL
>>> 2) the drm/msm can probe
>>> 3) at some point later when GPU fw is avail the GPU is loaded, drvdata
>>> is set, and we start creating and attaching the iommu_domain's that
>>> are actually used (one for kernel context and one each for userspace
>>> processes using the GPU
>>>
>>> I guess maybe if you are hitting this case of NULL drvdata, then you
>>> aren't getting an identity context for the dma-api managed
>>> iommu_domain?
>>>
>>> BR,
>>> -R
>>>
>> Yes, there are some warn_ons in io-pgtable.c, which have helped a lot 
>> during debugging. The following is the call stack when we are hitting 
>> the issue:
>>
>>    qcom_adreno_smmu_init_context+0x28/0x100
>>    arm_smmu_init_domain_context+0x1fc/0x4cc
>>    arm_smmu_attach_dev+0x7c/0x410
>>    __iommu_attach_device+0x28/0x110
>>    iommu_probe_device+0x98/0x144
>>    of_iommu_configure+0x1f0/0x278
>>    of_dma_configure_id+0x15c/0x320
>>    platform_dma_configure+0x24/0x90
>>    really_probe+0x138/0x39c
>>    __driver_probe_device+0x114/0x190
>>    device_driver_attach+0x4c/0xac
>>    bind_store+0xb8/0x110
>
> OK, so it looks like you are indeed getting a non-identity default 
> domain as Rob suspected. I guess that means qcom_smmu_client_of_match 
> needs updating for this platform? (In which case, maybe a WARN() here 
> to point in that direction might be handy as well?)
>
> Thanks,
> Robin.
>
Hi Rob and Robin,

With the NULL check and return -ENODATA, we are getting the dma-api 
managed iommu domain, and reattach the device to iommu domain by calling 
"of_dma_configure" from the driver probe, but without the NULL check it 
will be a kernel crash.

Also, with the NULL check we don't have to update the 
qcom_smmu_client_of_match and we can use the existing function - 
"qcom_adreno_smmu_init_context" itself.

Regards,
Aravind
>>
>> This is the call stack during platform_driver_register() , if there 
>> is no NULL check then the initial probe crashes, if there is NULL 
>> check, instead of crashing, the really_probe returns and we can call 
>> of_dma_configure again from the driver probe after setting the driver 
>> data. Please let me know if there is any concerns?
>>
>> Regards,
>>
>> Aravind
>>
>>>> Regards,
>>>>
>>>> Aravind
>>>>
>>>>>> +
>>>>>>           priv->cookie = smmu_domain;
>>>>>>           priv->get_ttbr1_cfg = qcom_adreno_smmu_get_ttbr1_cfg;
>>>>>>           priv->set_ttbr0_cfg = qcom_adreno_smmu_set_ttbr0_cfg;
>>>>>> -- 
>>>>>> 2.40.1
>>>>>>



More information about the linux-arm-kernel mailing list