[PATCH V3 0/8] IOMMU probe deferral support

Sricharan sricharan at codeaurora.org
Tue Nov 8 22:24:20 PST 2016


Hi Robin,

>On 04/11/16 15:16, Sricharan wrote:
>> Hi Robin,
>>
>>>>> Yikes, on second look, that definitely shouldn't be happening.
>>>>> Everything below is probably the resulting fallout.
>>>>
>>>> [   40.206703] vfio-pci 0000:08:00.0: Failed to setup iommu ops
>>>>
>>>> I think the above print which says "failed to setup iommu_ops"
>>>> because the call ops->add_device failed in of_pci_iommu_configure
>>>> is the reason for the failure, in my case i simply do not get this even with
>>>> your scripts. ops->add_device succeeds in the rebind as well. So still
>>>> checking what could be happening in your case.
>>>
>>> I was looking at your code base from [1].The ops->add_device
>>> callback from of_pci_iommu_configure on the rebind is the
>>> one which is causing the failure. But not able to spot out
>>>from code which point is causing the failure. It would be very helpful
>>> if i can know which is the return value from the add_device callback
>>> or point inside add_device callback which fails in your setup.
>>>
>>>
>>> [1] git://linux-arm.org/linux-rm iommu/misc
>>
>> With little more try, i saw an issue where i had an failure
>> similar to what you reported. The issue happens when multiple
>> devices fall in to same group due to matching sids. I ended up
>> doing a fix like below and it would be nice to verify if it is the same
>> that we are seeing in your setup and if the fix makes a difference ?
>>
>> From: Sricharan R <sricharan at codeaurora.org>
>> Date: Fri, 4 Nov 2016 20:28:49 +0530
>> Subject: [PATCH] iommu/arm-smmu: Fix group's reference counting
>>
>> iommu_group_get_for_dev which gets called in the add_device
>> callback, increases the reference count of the iommu_group,
>> so we do an iommu_group_put after that. iommu_group_get_for_dev
>> inturn calls device_group callback and in the case of arm-smmu
>> we call generic_device_group/pci_device_group which takes
>> care of increasing the group's reference. But when we return
>> an already existing group(when multiple devices have same group)
>> the reference is not incremented, resulting in issues when the
>> remove_device callback for the devices is invoked.
>> Fixing the same here.
>
>Bah, yes, this does look like my fault - after flip-flopping between
>about 3 different ways to keep refcounts for the S2CR entries, none of
>which would quite work, I ripped it all out but apparently still got
>things wrong, oh well. Thanks for figuring it out.
 >
>On the probe-deferral angle, whilst it's useful to have uncovered this
>bug, I don't think we should actually be calling remove_device() from
>DMA teardown. I think it's preferable from a user perspective if group
>numbering remains stable, rather than changing depending on the order in
>which they unbind/rebind VFIO drivers. I'm really keen to try and get
>this in shape for 4.10, so I've taken the liberty of hacking up my own
>branch (iommu/defer) based on v3 - would you mind taking a look at the
>two "iommu/of:" commits to see what you think? (Ignore the PCI changes
>to your later patches - that was an experiment which didn't really work out)

Ok, will take a look at this now and respond more on this.

>
>> Signed-off-by: Sricharan R <sricharan at codeaurora.org>
>> ---
>>  drivers/iommu/arm-smmu.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index 71ce4b6..a1d0b3c 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -1516,8 +1516,10 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev)
>>  		group = smmu->s2crs[idx].group;
>>  	}
>>
>> -	if (group)
>> +	if (group) {
>> +		iommu_group_get_by_id(iommu_group_id(group));
>>  		return group;
>
>This might as well just be inline, i.e.:
>
>		return iommu_group_get_by_id(iommu_group_id(group));
>
>It's a shame we have to go all round the houses when we have the group
>right there, but this is probably the most expedient fix. I guess we can
>extend the API with some sort of iommu_group_get(group) overload in
>future if we really want to.
>

ok, i can send this fix separately then. Otherwise, Will was suggesting on the
other thread that there should probably be a separate API to increment
the group refcount or get the group from the existing aliasing device.
As per me adding the api, looks like another option or post the above ?

Regards,
 Sricharan




More information about the linux-arm-kernel mailing list