[RFC PATCH 4/4] iommu: make IOVA domain page size explicit
leizhen
thunder.leizhen at huawei.com
Wed Nov 26 23:10:50 PST 2014
On 2014/11/26 21:31, Robin Murphy wrote:
> Hi,
>
> On 26/11/14 07:17, leizhen wrote:
> [...]
>>> diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
>>> index a3dbba8..9e50625 100644
>>> --- a/drivers/iommu/iova.c
>>> +++ b/drivers/iommu/iova.c
>>> @@ -55,12 +55,16 @@ void free_iova_mem(struct iova *iova)
>>> }
>>>
>>> void
>>> -init_iova_domain(struct iova_domain *iovad, unsigned long start_pfn,
>>> - unsigned long pfn_32bit)
>>> +init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>> + unsigned long start_pfn, unsigned long pfn_32bit)
>>> {
>>> + /* IOMMUs must support at least the CPU page size */
>>> + BUG_ON(granule > PAGE_SIZE);
>>
>> ??? BUG_ON(granule < PAGE_SIZE);
>>
>
> Granule represents the smallest page size that the IOMMU can map - we'd never allocate less IOVA space than a single page, so we don't need a unit smaller than that, and a bigger unit would make it
> problematic to allocate IOVA space for individual pages.
>
> If, for instance, the CPU has 64k pages and the IOMMU supports 1k tiny pages, granule would be less than PAGE_SIZE, but the IOMMU could happily map that 64k as 64 contiguous tiny pages, even if it's a
> waste of TLB space.
>
> However, say the IOMMU only supports 16k pages and we try to map two noncontiguous 4k CPU pages as a contiguous 8k buffer in IOVA space - this can't work, which is why the _minimum_ IOMMU page size,
> and thus granule, may never be _bigger_ than a CPU page.
>
Oh, I was just confused by the comment: /* IOMMUs must support at least the CPU page size */
The code actually enforces at *most* the CPU page size. Now I get it.
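
To restate the rule with concrete numbers (an illustrative sketch only, the page sizes are made up):

	/*
	 * granule is the smallest page size the IOMMU can map.
	 *
	 * OK:  PAGE_SIZE = 64K, granule = 1K
	 *      one 64K CPU page maps as 64 contiguous 1K IOMMU pages.
	 *
	 * Bad: PAGE_SIZE = 4K, granule = 16K
	 *      two noncontiguous 4K CPU pages can never be made
	 *      contiguous in an 8K IOVA range, because the IOMMU
	 *      cannot remap at sub-16K granularity.
	 */
	BUG_ON(granule > PAGE_SIZE);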
>>> +
>>> spin_lock_init(&iovad->iova_rbtree_lock);
>>> iovad->rbroot = RB_ROOT;
>>> iovad->cached32_node = NULL;
>>> + iovad->granule = granule;
>>
>> Do you need to check that granule is a power of two (2^n)?
>>
>
> Yes, that would make a lot of sense.
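
For reference, a minimal sketch of such a check (is_power_of_2() is from <linux/log2.h>; folding it into the existing BUG_ON is just one possible style):

	#include <linux/log2.h>

	/* the shift/mask helpers rely on granule being a power of two */
	BUG_ON(!is_power_of_2(granule) || granule > PAGE_SIZE);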
>
>>> iovad->start_pfn = start_pfn;
>>> iovad->dma_32bit_pfn = pfn_32bit;
>>> }
>>> diff --git a/include/linux/iova.h b/include/linux/iova.h
>>> index 591b196..3920a19 100644
>>> --- a/include/linux/iova.h
>>> +++ b/include/linux/iova.h
>>> @@ -28,6 +28,7 @@ struct iova_domain {
>>> spinlock_t iova_rbtree_lock; /* Lock to protect update of rbtree */
>>> struct rb_root rbroot; /* iova domain rbtree root */
>>> struct rb_node *cached32_node; /* Save last alloced node */
>>> + unsigned long granule; /* pfn granularity for this domain */
>>> unsigned long start_pfn; /* Lower limit for this domain */
>>> unsigned long dma_32bit_pfn;
>>> };
>>> @@ -37,6 +38,36 @@ static inline unsigned long iova_size(struct iova *iova)
>>> return iova->pfn_hi - iova->pfn_lo + 1;
>>> }
>>>
>>> +static inline unsigned long iova_shift(struct iova_domain *iovad)
>>> +{
>>> + return __ffs(iovad->granule);
>>
>> I think it would be good to save the result in a struct member. If we calculate it every time, it's too slow.
>>
>
> I did give some consideration at the time to which of size, shift and mask to store - given the redundancy between them, storing more than one seems excessively wasteful - and this made for the
> nicest-looking code IMO. These IOVA operations are going to be mostly used in map/unmap situations leading to an IOMMU TLB flush anyway, so I'm not sure how performance-critical they really are.
>
> Are we likely to see this code used on architectures where __ffs takes more than a few cycles and has more impact than the general memory bloat of making structures bigger?
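
For comparison, the cached variant I had in mind would look roughly like this (granule_shift is a hypothetical field name, not in the patch):

	struct iova_domain {
		spinlock_t	iova_rbtree_lock; /* Lock to protect update of rbtree */
		struct rb_root	rbroot;		/* iova domain rbtree root */
		struct rb_node	*cached32_node;	/* Save last alloced node */
		unsigned long	granule;	/* pfn granularity for this domain */
		unsigned long	granule_shift;	/* cached __ffs(granule) */
		unsigned long	start_pfn;	/* Lower limit for this domain */
		unsigned long	dma_32bit_pfn;
	};

	static inline unsigned long iova_shift(struct iova_domain *iovad)
	{
		return iovad->granule_shift; /* computed once in init_iova_domain() */
	}

The trade-off is one extra unsigned long per domain against an __ffs() on every helper call.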
>
>
> Thanks for the review,
> Robin.
>
>>> +}
>>> +
>>> +static inline unsigned long iova_mask(struct iova_domain *iovad)
>>> +{
>>> + return iovad->granule - 1;
>>> +}
>>> +
>>> +static inline size_t iova_offset(struct iova_domain *iovad, dma_addr_t iova)
>>> +{
>>> + return iova & iova_mask(iovad);
>>> +}
>>> +
>>> +static inline size_t iova_align(struct iova_domain *iovad, size_t size)
>>> +{
>>> + return ALIGN(size, iovad->granule);
>>> +}
>>> +
>>> +static inline dma_addr_t iova_dma_addr(struct iova_domain *iovad, struct iova *iova)
>>> +{
>>> + return (dma_addr_t)iova->pfn_lo << iova_shift(iovad);
>>> +}
>>> +
>>> +static inline unsigned long iova_pfn(struct iova_domain *iovad, dma_addr_t iova)
>>> +{
>>> + return iova >> iova_shift(iovad);
>>> +}
>>> +
>>> int iommu_iova_cache_init(void);
>>> void iommu_iova_cache_destroy(void);
>>>
>>> @@ -50,8 +81,8 @@ struct iova *alloc_iova(struct iova_domain *iovad, unsigned long size,
>>> struct iova *reserve_iova(struct iova_domain *iovad, unsigned long pfn_lo,
>>> unsigned long pfn_hi);
>>> void copy_reserved_iova(struct iova_domain *from, struct iova_domain *to);
>>> -void init_iova_domain(struct iova_domain *iovad, unsigned long start_pfn,
>>> - unsigned long pfn_32bit);
>>> +void init_iova_domain(struct iova_domain *iovad, unsigned long granule,
>>> + unsigned long start_pfn, unsigned long pfn_32bit);
>>> struct iova *find_iova(struct iova_domain *iovad, unsigned long pfn);
>>> void put_iova_domain(struct iova_domain *iovad);
>>> struct iova *split_and_remove_iova(struct iova_domain *iovad,
>>>
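
As an aside, the new helpers compose quite naturally; a hypothetical map-path usage might look like this (sketch only - the surrounding flow and the function name are assumed, not part of this patch):

	static dma_addr_t example_map(struct iova_domain *iovad,
				      phys_addr_t phys, size_t size)
	{
		/* round the request out to whole IOVA pages */
		size_t len = iova_align(iovad, size + iova_offset(iovad, phys));
		struct iova *iova;

		iova = alloc_iova(iovad, len >> iova_shift(iovad),
				  iovad->dma_32bit_pfn, true);
		if (!iova)
			return 0;

		/* ... iommu_map() the pages at iova_dma_addr(iovad, iova) ... */

		return iova_dma_addr(iovad, iova) + iova_offset(iovad, phys);
	}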