[Xen-devel] [RFC] add a struct page* parameter to dma_map_ops.unmap_page

David Vrabel david.vrabel at citrix.com
Mon Nov 17 06:43:46 PST 2014


On 17/11/14 14:11, Stefano Stabellini wrote:
> Hi all,
> I am writing this email to ask for your advice.
> 
> On architectures where dma addresses are different from physical
> addresses, it can be difficult to retrieve the physical address of a
> page from its dma address.
> 
> Specifically this is the case for Xen on arm and arm64 but I think that
> other architectures might have the same issue.
> 
> Knowing the physical address is necessary to be able to issue any
> required cache maintenance operations when unmap_page,
> sync_single_for_cpu and sync_single_for_device are called.
> 
> Adding a struct page* parameter to unmap_page, sync_single_for_cpu and
> sync_single_for_device would make Linux dma handling on Xen on arm and
> arm64 much easier and quicker.

Using an opaque handle instead of a struct page * would be more
beneficial for the Intel IOMMU driver.  For example:

typedef dma_addr_t dma_handle_t;

dma_handle_t dma_map_single(struct device *dev,
                            void *va, size_t size,
                            enum dma_data_direction dir);
void dma_unmap_single(struct device *dev,
                      dma_handle_t handle, size_t size,
                      enum dma_data_direction dir);

etc.

Drivers would then use:

dma_addr_t dma_addr(dma_handle_t handle);

to obtain the bus address from the handle.
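
As a rough userspace illustration of why a handle helps, the sketch
below assumes the handle is a small struct that remembers the physical
address next to the bus address, so unmap needs no reverse lookup.
Everything in it is hypothetical and not kernel code: the sketch_*
names, the fixed SKETCH_BUS_OFFSET translation, the stand-in typedefs,
and the simplified signatures (no struct device *, no direction).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;   /* stand-in for the kernel type */
typedef uint64_t phys_addr_t;  /* stand-in for the kernel type */

/* Hypothetical handle layout: drivers would treat it as opaque. */
typedef struct {
    dma_addr_t  bus;   /* address the device is programmed with */
    phys_addr_t phys;  /* kept so unmap does not have to look it up */
} dma_handle_t;

/* Assumed fixed offset standing in for a real phys-to-bus translation. */
#define SKETCH_BUS_OFFSET 0x80000000ULL

/* Records the address that the (pretend) cache maintenance was issued on. */
static phys_addr_t last_cache_maintenance;

/* Map: translate once and remember both addresses in the handle. */
static dma_handle_t sketch_dma_map_single(phys_addr_t phys, size_t size)
{
    dma_handle_t h = { .bus = phys + SKETCH_BUS_OFFSET, .phys = phys };

    assert(size > 0);
    return h;
}

/* Accessor drivers would call to get the bus address from the handle. */
static dma_addr_t dma_addr(dma_handle_t handle)
{
    return handle.bus;
}

/* Unmap: the physical address comes straight from the handle. */
static void sketch_dma_unmap_single(dma_handle_t handle, size_t size)
{
    assert(size > 0);
    last_cache_maintenance = handle.phys;
}
```

A driver would keep the handle it got from the map call, program the
device with dma_addr(handle), and pass the same handle back on unmap.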

> I think that other drivers have similar problems, such as the Intel
> IOMMU driver having to call find_iova and walking down an rbtree to get
> the physical address in its implementation of unmap_page.
> 
> Callers have the struct page* in their hands already from the previous
> map_page call so it shouldn't be an issue for them.  A problem does
> exist however: there are about 280 callers of dma_unmap_page and
> pci_unmap_page. We have even more callers of the dma_sync_single_for_*
> functions.

You will also need to fix dma_unmap_single() and pci_unmap_single()
(another 1000+ callers).

You may need to consider adding a parallel set of map/unmap API calls
that return/accept a handle, and then convert drivers one-by-one as
required, instead of trying to convert every single driver at once.
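
A minimal userspace sketch of how such a parallel API could coexist with
the existing one is below.  It is only an illustration under stated
assumptions: all sketch_* names, the fixed-size table, the fixed bus
offset and the simplified signatures are hypothetical, mappings are never
removed, and the linear search merely stands in for whatever reverse
lookup a real implementation does today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t dma_addr_t;   /* stand-in for the kernel type */
typedef uint64_t phys_addr_t;  /* stand-in for the kernel type */

/* Hypothetical handle carrying both addresses. */
typedef struct {
    dma_addr_t  bus;
    phys_addr_t phys;
} dma_handle_t;

#define SKETCH_MAX_MAPPINGS 8
#define SKETCH_BUS_OFFSET   0x80000000ULL  /* assumed translation */

static dma_handle_t table[SKETCH_MAX_MAPPINGS]; /* live mappings */
static size_t table_len;
static unsigned lookups;  /* counts reverse lookups by the legacy path */

/* New-style map: returns a handle (and records it for legacy unmap). */
static dma_handle_t sketch_map_handle(phys_addr_t phys)
{
    dma_handle_t h = { .bus = phys + SKETCH_BUS_OFFSET, .phys = phys };

    assert(table_len < SKETCH_MAX_MAPPINGS);
    table[table_len++] = h;
    return h;
}

/* Legacy-style map: same mapping, but the caller only gets the bus address. */
static dma_addr_t sketch_map_legacy(phys_addr_t phys)
{
    return sketch_map_handle(phys).bus;
}

/* Legacy unmap: has to search for the physical address (0 if not found). */
static phys_addr_t sketch_unmap_legacy(dma_addr_t bus)
{
    lookups++;
    for (size_t i = 0; i < table_len; i++)
        if (table[i].bus == bus)
            return table[i].phys;
    return 0;
}

/* New-style unmap: no search, the handle already carries the address. */
static phys_addr_t sketch_unmap_handle(dma_handle_t handle)
{
    return handle.phys;
}
```

Unconverted drivers keep calling the legacy pair and pay for the lookup;
a converted driver switches to the handle pair and skips it, so the two
sets can live side by side during the transition.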

David


