[PATCH V3 0/8] IOMMU probe deferral support

Sricharan sricharan at codeaurora.org
Sun Nov 20 07:11:19 PST 2016


Hi Robin,

>Hi Robin,
>
>>Hi Robin,
>>
>>>On 04/11/16 15:16, Sricharan wrote:
>>>> Hi Robin,
>>>>
>>>>>>> Yikes, on second look, that definitely shouldn't be happening.
>>>>>>> Everything below is probably the resulting fallout.
>>>>>>
>>>>>> [   40.206703] vfio-pci 0000:08:00.0: Failed to setup iommu ops
>>>>>>
>>>>>> I think the above print which says "failed to setup iommu_ops"
>>>>>> because the call ops->add_device failed in of_pci_iommu_configure
>>>>>> is the reason for the failure, in my case i simply do not get this even with
>>>>>> your scripts. ops->add_device succeeds in the rebind as well. So still
>>>>>> checking what could be happening in your case.
>>>>>
>>>>> I was looking at your code base from [1].The ops->add_device
>>>>> callback from of_pci_iommu_configure on the rebind is the
>>>>> one which is causing the failure. But not able to spot out
>>>>>from code which point is causing the failure. It would be very helpful
>>>>> if i can know which is the return value from the add_device callback
>>>>> or point inside add_device callback which fails in your setup.
>>>>>
>>>>>
>>>>> [1] git://linux-arm.org/linux-rm iommu/misc
>>>>
>>>> With little more try, i saw an issue where i had an failure
>>>> similar to what you reported. The issue happens when multiple
>>>> devices fall in to same group due to matching sids. I ended up
>>>> doing a fix like below and it would be nice to verify if it is the same
>>>> that we are seeing in your setup and if the fix makes a difference ?
>>>>
>>>> From: Sricharan R <sricharan at codeaurora.org>
>>>> Date: Fri, 4 Nov 2016 20:28:49 +0530
>>>> Subject: [PATCH] iommu/arm-smmu: Fix group's reference counting
>>>>
>>>> iommu_group_get_for_dev which gets called in the add_device
>>>> callback, increases the reference count of the iommu_group,
>>>> so we do an iommu_group_put after that. iommu_group_get_for_dev
>>>> inturn calls device_group callback and in the case of arm-smmu
>>>> we call generic_device_group/pci_device_group which takes
>>>> care of increasing the group's reference. But when we return
>>>> an already existing group(when multiple devices have same group)
>>>> the reference is not incremented, resulting in issues when the
>>>> remove_device callback for the devices is invoked.
>>>> Fixing the same here.
>>>
>>>Bah, yes, this does look like my fault - after flip-flopping between
>>>about 3 different ways to keep refcounts for the S2CR entries, none of
>>>which would quite work, I ripped it all out but apparently still got
>>>things wrong, oh well. Thanks for figuring it out.
>> >
>>>On the probe-deferral angle, whilst it's useful to have uncovered this
>>>bug, I don't think we should actually be calling remove_device() from
>>>DMA teardown. I think it's preferable from a user perspective if group
>>>numbering remains stable, rather than changing depending on the order in
>>>which they unbind/rebind VFIO drivers. I'm really keen to try and get
>>>this in shape for 4.10, so I've taken the liberty of hacking up my own
>>>branch (iommu/defer) based on v3 - would you mind taking a look at the
>>>two "iommu/of:" commits to see what you think? (Ignore the PCI changes
>>>to your later patches - that was an experiment which didn't really work out)
>>
>>Ok, will take a look at this now and respond more on this.
>>
>Sorry for the delayed response on this. I was OOO for the last few days.
>So i tested this branch and it worked fine. I tested it with a pci device
>for both normal and deferred probe cases.  The of/iommu patches
>are the cleanup/preparation patches and it looks fine. One thing is without
>calling the remove_device callback, the resources like (smes for exmaple)
>and the group association of the device all remain allocated. That does not
>feel correct, given that the associated device does not exist. So to
>understand that, what happens with VFIO in this case which makes the
>group renumbering/rebinding a problem ?
>

Would it be ok if i post a V4 based on your branch above ?

Regards,
 Sricharan




More information about the linux-arm-kernel mailing list