[PATCH v6 6/7] dma-reserved-iommu: iommu_get/put_single_reserved

Jean-Philippe Brucker Jean-Philippe.Brucker at arm.com
Thu Apr 7 07:38:53 PDT 2016


Hi Eric,

On Thu, Apr 07, 2016 at 11:33:42AM +0200, Eric Auger wrote:
> Alex,
> On 04/07/2016 01:12 AM, Alex Williamson wrote:
> > On Mon,  4 Apr 2016 08:07:01 +0000
> > Eric Auger <eric.auger at linaro.org> wrote:
> > 
> >> This patch introduces iommu_get/put_single_reserved.
> >>
> >> iommu_get_single_reserved allows the caller to allocate a new
> >> reserved iova page and map it onto the physical page that contains a
> >> given physical address. The page size is the IOMMU page size. It is
> >> the responsibility of the system integrator to make sure the IOMMU
> >> page size in use corresponds to the granularity of the MSI frame.
> >>
> >> It returns the iova that is mapped onto the provided physical
> >> address; hence the physical address passed as an argument does not
> >> need to be aligned.
> >>
> >> If a mapping already exists between the two pages, the IOVA mapped
> >> to the PA is returned directly.
> >>
> >> Each time an iova is successfully returned, a binding ref count is
> >> incremented.
> >>
> >> iommu_put_single_reserved decrements the ref count; when it reaches
> >> zero, the mapping is destroyed and the iova is released.
> >>
> >> Signed-off-by: Eric Auger <eric.auger at linaro.org>
> >> Signed-off-by: Ankit Jindal <ajindal at apm.com>
> >> Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar at linaro.org>
> >> Signed-off-by: Bharat Bhushan <Bharat.Bhushan at freescale.com>
> >>
> >> ---
> >>
> >> v5 -> v6:
> >> - revisit locking with spin_lock instead of mutex
> >> - do not kref_get on 1st get
> >> - add size parameter to the get function following Marc's request
> >> - use the iova domain shift instead of using the smallest supported page size
> >>
> >> v3 -> v4:
> >> - formerly in iommu: iommu_get/put_single_reserved &
> >>   iommu/arm-smmu: implement iommu_get/put_single_reserved
> >> - Attempted to address Marc's doubts about missing size/alignment
> >>   at VFIO level (user-space knows the IOMMU page size and the number
> >>   of IOVA pages to provision)
> >>
> >> v2 -> v3:
> >> - remove static implementation of iommu_get_single_reserved &
> >>   iommu_put_single_reserved when CONFIG_IOMMU_API is not set
> >>
> >> v1 -> v2:
> >> - previously a VFIO API, named vfio_alloc_map/unmap_free_reserved_iova
> >> ---
> >>  drivers/iommu/dma-reserved-iommu.c | 146 +++++++++++++++++++++++++++++++++++++
> >>  include/linux/dma-reserved-iommu.h |  28 +++++++
> >>  2 files changed, 174 insertions(+)
> >>
> >> diff --git a/drivers/iommu/dma-reserved-iommu.c b/drivers/iommu/dma-reserved-iommu.c
> >> index f592118..3c759d9 100644
> >> --- a/drivers/iommu/dma-reserved-iommu.c
> >> +++ b/drivers/iommu/dma-reserved-iommu.c
> >> @@ -136,3 +136,149 @@ void iommu_free_reserved_iova_domain(struct iommu_domain *domain)
> >>  	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> >>  }
> >>  EXPORT_SYMBOL_GPL(iommu_free_reserved_iova_domain);
> >> +
> >> +static void delete_reserved_binding(struct iommu_domain *domain,
> >> +				    struct iommu_reserved_binding *b)
> >> +{
> >> +	struct iova_domain *iovad =
> >> +		(struct iova_domain *)domain->reserved_iova_cookie;
> >> +	unsigned long order = iova_shift(iovad);
> >> +
> >> +	iommu_unmap(domain, b->iova, b->size);
> >> +	free_iova(iovad, b->iova >> order);
> >> +	kfree(b);
> >> +}
> >> +
> >> +int iommu_get_reserved_iova(struct iommu_domain *domain,
> >> +			      phys_addr_t addr, size_t size, int prot,
> >> +			      dma_addr_t *iova)
> >> +{
> >> +	struct iova_domain *iovad =
> >> +		(struct iova_domain *)domain->reserved_iova_cookie;
> >> +	unsigned long order = iova_shift(iovad);

Another nit: this call should be after the !iovad check below
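
e.g., since order (and everything derived from it) depends on the
cookie, the whole computation would move below the check, roughly:

	struct iova_domain *iovad =
		(struct iova_domain *)domain->reserved_iova_cookie;
	unsigned long order;
	...

	if (!iovad)
		return -EINVAL;

	order = iova_shift(iovad);
	base_pfn = addr >> order;
	end_pfn = (addr + size - 1) >> order;
	...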

> >> +	unsigned long  base_pfn, end_pfn, nb_iommu_pages;
> >> +	size_t iommu_page_size = 1 << order, binding_size;
> >> +	phys_addr_t aligned_base, offset;
> >> +	struct iommu_reserved_binding *b, *newb;
> >> +	unsigned long flags;
> >> +	struct iova *p_iova;
> >> +	bool unmap = false;
> >> +	int ret;
> >> +
> >> +	base_pfn = addr >> order;
> >> +	end_pfn = (addr + size - 1) >> order;
> >> +	nb_iommu_pages = end_pfn - base_pfn + 1;
> >> +	aligned_base = base_pfn << order;
> >> +	offset = addr - aligned_base;
> >> +	binding_size = nb_iommu_pages * iommu_page_size;
> >> +
> >> +	if (!iovad)
> >> +		return -EINVAL;
> >> +
> >> +	spin_lock_irqsave(&domain->reserved_lock, flags);
> >> +
> >> +	b = find_reserved_binding(domain, aligned_base, binding_size);
> >> +	if (b) {
> >> +		*iova = b->iova + offset;
> >> +		kref_get(&b->kref);
> >> +		ret = 0;
> >> +		goto unlock;
> >> +	}
> >> +
> >> +	spin_unlock_irqrestore(&domain->reserved_lock, flags);
> >> +
> >> +	/*
> >> +	 * no reserved IOVA was found for this PA, start allocating and
> >> +	 * registering one while the spin-lock is not held. iommu_map/unmap
> >> +	 * are not supposed to be atomic
> >> +	 */
> >> +
> >> +	p_iova = alloc_iova(iovad, nb_iommu_pages, iovad->dma_32bit_pfn, true);
> >> +	if (!p_iova)
> >> +		return -ENOMEM;
> > 
> > Here we're using iovad, which was the reserved_iova_cookie outside of
> > the locking, which makes the locking ineffective.  Isn't this lock also
> > protecting our iova domain, I'm confused.
> I think I was too when I wrote that :-(
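
Note that alloc_iova/free_iova take the iova domain's own internal
lock, so I believe reserved_lock only has to protect the binding tree
(and the lifetime of the cookie). The remaining problem is the window
between dropping the lock after the lookup and re-taking it for the
insertion: two callers can both miss and both allocate. The usual
pattern is to re-check under the lock and undo the speculative work
when losing the race, roughly (which I suspect is what the unmap flag
above is for):

	/* newb was allocated and mapped outside the lock */
	spin_lock_irqsave(&domain->reserved_lock, flags);
	b = find_reserved_binding(domain, aligned_base, binding_size);
	if (b) {
		/* lost the race: reuse the winner's binding */
		*iova = b->iova + offset;
		kref_get(&b->kref);
		unmap = true;	/* undo our mapping after unlocking */
		goto unlock;
	}
	...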
> > 
> >> +
> >> +	*iova = iova_dma_addr(iovad, p_iova);
> >> +
> >> +	newb = kzalloc(sizeof(*b), GFP_KERNEL);
> needs to be GFP_ATOMIC as Jean-Philippe stated before.
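
Right -- if this can end up being called from the compose path, which
runs in atomic context, then:

	newb = kzalloc(sizeof(*newb), GFP_ATOMIC);

(and sizeof(*newb) rather than sizeof(*b), while at it.)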
> >> +	if (!newb) {
> >> +		free_iova(iovad, p_iova->pfn_lo);
> >> +		return -ENOMEM;
> >> +	}
> >> +
> >> +	ret = iommu_map(domain, *iova, aligned_base, binding_size, prot);
> one problem I have is that I would need iommu_map to be atomic
> (because of the call sequence reported by Jean-Philippe), and I don't
> think I have any guarantee that it is in general. It is for
> arm-smmu(-v3).c, which covers the ARM use case, but what about other
> SMMUs potentially used in that process?

Hmm, indeed. How about we move all the heavy mapping work to
msi_domain_prepare_irqs instead? It is allowed to sleep and, more
importantly, fail. It is called much earlier, by pci_enable_msi_range.

All we are missing is details about doorbells, which we could retrieve
from the MSI controller's driver, using one or more additional callbacks
in msi_domain_ops. We already need one such callback for querying the
number of doorbell pages; maybe we could also ask the driver to provide
their addresses? And in msi_domain_activate, simply look up the IOVA
already associated with the MSI message?
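
Something like this, with completely made-up names (none of it exists
today):

	struct msi_domain_ops {
		...
		/* number of doorbell pages used by this MSI controller */
		int	(*msi_doorbell_pages)(struct irq_domain *domain);
		/* physical address of doorbell page @i */
		int	(*msi_doorbell_addr)(struct irq_domain *domain,
					     unsigned int i,
					     phys_addr_t *addr);
	};

msi_domain_prepare_irqs would iterate over the doorbells and call
iommu_get_reserved_iova for each of them, in a context that is allowed
to sleep and to fail. msi_domain_activate would then only perform the
(atomic) lookup of the IOVA for the address it puts in the MSI message.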

I have only briefly thought about it from the host point of view; I am
not sure how VFIO would cope with this method.

Thanks,
Jean-Philippe


