[PATCH v7 08/21] iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg

Robin Murphy robin.murphy at arm.com
Thu Jun 30 07:56:56 PDT 2022


On 2022-06-29 23:41, Logan Gunthorpe wrote:
> 
> 
> On 2022-06-29 13:15, Robin Murphy wrote:
>> On 2022-06-29 16:57, Logan Gunthorpe wrote:
>>>
>>>
>>>
>>> On 2022-06-29 06:07, Robin Murphy wrote:
>>>> On 2022-06-15 17:12, Logan Gunthorpe wrote:
>>>>> When a PCI P2PDMA page is seen, set the IOVA length of the segment
>>>>> to zero so that it is not mapped into the IOVA. Then, in finalise_sg(),
>>>>> apply the appropriate bus address to the segment. The IOVA is not
>>>>> created if the scatterlist only consists of P2PDMA pages.
>>>>>
>>>>> A P2PDMA page may have three possible outcomes when being mapped:
>>>>>      1) If the data path between the two devices doesn't go through
>>>>>         the root port, then it should be mapped with a PCI bus address
>>>>>      2) If the data path goes through the host bridge, it should be
>>>>> mapped
>>>>>         normally with an IOMMU IOVA.
>>>>>      3) It is not possible for the two devices to communicate and thus
>>>>>         the mapping operation should fail (and it will return
>>>>> -EREMOTEIO).
>>>>>
>>>>> Similar to dma-direct, the sg_dma_mark_pci_p2pdma() flag is used to
>>>>> indicate bus address segments. On unmap, P2PDMA segments are skipped
>>>>> over when determining the start and end IOVA addresses.
>>>>>
>>>>> With this change, the flags variable in the dma_map_ops is set to
>>>>> DMA_F_PCI_P2PDMA_SUPPORTED to indicate support for P2PDMA pages.
>>>>>
>>>>> Signed-off-by: Logan Gunthorpe <logang at deltatee.com>
>>>>> Reviewed-by: Jason Gunthorpe <jgg at nvidia.com>
>>>>> ---
>>>>>     drivers/iommu/dma-iommu.c | 68
>>>>> +++++++++++++++++++++++++++++++++++----
>>>>>     1 file changed, 61 insertions(+), 7 deletions(-)
>>>>>
>>>>> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
>>>>> index f90251572a5d..b01ca0c6a7ab 100644
>>>>> --- a/drivers/iommu/dma-iommu.c
>>>>> +++ b/drivers/iommu/dma-iommu.c
>>>>> @@ -21,6 +21,7 @@
>>>>>     #include <linux/iova.h>
>>>>>     #include <linux/irq.h>
>>>>>     #include <linux/list_sort.h>
>>>>> +#include <linux/memremap.h>
>>>>>     #include <linux/mm.h>
>>>>>     #include <linux/mutex.h>
>>>>>     #include <linux/pci.h>
>>>>> @@ -1062,6 +1063,16 @@ static int __finalise_sg(struct device *dev,
>>>>> struct scatterlist *sg, int nents,
>>>>>             sg_dma_address(s) = DMA_MAPPING_ERROR;
>>>>>             sg_dma_len(s) = 0;
>>>>>     +        if (is_pci_p2pdma_page(sg_page(s)) && !s_iova_len) {
>>>>
>>>> Logically, should we not be able to use sg_is_dma_bus_address() here? I
>>>> think it should be feasible, and simpler, to prepare the p2p segments
>>>> up-front, such that at this point all we need to do is restore the
>>>> original length (if even that, see below).
>>>
>>> Per my previous email, no, because sg_is_dma_bus_address() is not set
>>> yet and not meant to tell you something about the page. That flag will
>>> be set below by pci_p2pdma_map_bus_segment() and then checkd in
>>> iommu_dma_unmap_sg() to determine if the dma_address in the segment
>>> needs to be unmapped.
>>
>> I know it's not set yet as-is; I'm suggesting things should be
>> restructured so that it *would be*. In the logical design of this code,
>> the DMA addresses are effectively determined in iommu_dma_map_sg(), and
>> __finalise_sg() merely converts them from a relative to an absolute form
>> (along with undoing the other trickery). Thus the call to
>> pci_p2pdma_map_bus_segment() absolutely belongs in the main
>> iommu_map_sg() loop.
> 
> I don't see how that can work: __finalise_sg() does more than convert
> them from relative to absolute, it also figures out which SG entry will
> contain which dma_address segment. Which segment a P2PDMA address needs
> to be programmed into depends on the how 'cur' is calculated which in
> turn depends on things like seg_mask and max_len. This calculation is
> not done in iommu_dma_map_sg() so I don't see how there's any hope of
> assigning the bus address for the P2P segments in that function.
> 
> If there's a way to restructure things so that's possible that I'm not
> seeing, I'm open to it but it's certainly not immediately obvious.

Huh? It's still virtually the same thing; iommu_dma_map_sg() calls 
pci_p2pdma_map_bus_segment(s) and sets s->length to 0 if 
PCI_P2PDMA_MAP_BUS_ADDR, then __finalise_sg() can use 
sg_is_dma_bus_address(s) in place of is_pci_p2pdma_page(sg_page(s)), and 
just propagate the DMA address and original length from s to cur.

Here you've written a patch which looks to correctly interrupt any 
ongoing concatenation state and convey some data from the given input 
segment to the appropriate output segment, so I'm baffled by why you'd 
think you couldn't do what you've already done.

Thanks,
Robin.



More information about the Linux-nvme mailing list