[PATCH 1/3] of/pci/dma: fix DMA configuration for PCI masters

Oza Oza oza.oza at broadcom.com
Fri May 5 22:30:18 PDT 2017


On Fri, May 5, 2017 at 8:55 PM, Robin Murphy <robin.murphy at arm.com> wrote:
> On 04/05/17 19:41, Oza Oza wrote:
> [...]
>>>> 5) leaves scope of adding PCI flag handling for inbound memory
>>>> by the new function.
>>>
>>> Which flags would ever actually matter? DMA windows aren't going to be
>>> to config or I/O space, so the memory type can be assumed, and the
>>> 32/64-bit distinction is irrelevant as it's not a relocatable BAR;
>>> DMA-able system memory isn't going to be read-sensitive, so the
>>> prefetchable flag shouldn't matter; and not being a BAR none of the
>>> others would be relevant either.
>>>
>>
>> Thanks Robin; for your reply and attention:
>>
>> agree with you, at present it would not matter,
>> but it does not mean that we do not scope it to make it matter in future.
>>
>> now where it could matter:
>> there is Relaxed Ordering for inbound memory for PCI.
>> According to standard, Relaxed Ordering (RO) bit can be set only for
>> Memory requests and completions (if present in the original request).
>> Also, according to transaction ordering rules, I/O and configuration
>> requests can still be re-ordered ahead of each other.
>> and we would like to make use of it.
>> for e.g. lets say we mark memory as Relaxed Ordered with flag.
>> the special about this memory is incoming PCI transactions can be
>> reordered and rest memory has to be strongly ordered.
>
> Please look at "PCI Bus Binding to: IEEE Std 1275-1994 Standard for Boot
> (Initialization Configuration) Firmware" (as referenced in DTSpec) and
> explain how PCIe Relaxed Order has anything to do with the DT binding.
>
>> how it our SOC would make use of this is out of scope for the
>> discussion at this point of time, but I am just bringing in the
>> idea/point how flags could be useful
>> for inbound memory, since we would not like throw-away flags completely.
>
> The premise for implementing a PCI-specific parser is that you claim we
> need to do something with the phys.hi cell of a DT PCI address, rather
> than just taking the numerical part out of the phys.mid and phys.lo
> cells. Please make that argument in reference to the flags which that
> upper cell actually encodes, not unrelated things.
>

I think, the whole discussion around inbound flags is not what I
intended to bring.
this patch does nothing about inbound flag and never intends to solve
anything regarding inbound flags.
infact I would like to remove point 5 form the commit message. which
should keep it out of discussion completely.

let met tell what this patch is trying to address/solve 2 BUGs
1) fix wrong size return from of_dma_configure for PCI masters. (which
is right now BUG)

2) handles multiple dma-ranges cleanly

3) It takes care of dma-ranges being optional.

4) following is the comment on function of_dma_get_range (which is also a BUG)
                "It returns -ENODEV if "dma-ranges" property was not found
                 * for this device in DT."
which I think is wrong for PCI device, because if dma-ranges are
absent then instead of returning  -ENODEV,
it should return 0 with largest possible host memory.

it solves all the above 4 problems.

> [...]
>>>> +int of_pci_get_dma_ranges(struct device_node *np, struct list_head *resources)
>>>> +{
>>>> +     struct device_node *node = of_node_get(np);
>>>> +     int rlen;
>>>> +     int ret = 0;
>>>> +     const int na = 3, ns = 2;
>>>> +     struct resource *res;
>>>> +     struct of_pci_range_parser parser;
>>>> +     struct of_pci_range range;
>>>> +
>>>> +     if (!node)
>>>> +             return -EINVAL;
>>>> +
>>>> +     parser.node = node;
>>>> +     parser.pna = of_n_addr_cells(node);
>>>> +     parser.np = parser.pna + na + ns;
>>>> +
>>>> +     parser.range = of_get_property(node, "dma-ranges", &rlen);
>>>> +
>>>> +     if (!parser.range) {
>>>> +             pr_debug("pcie device has no dma-ranges defined for node(%s)\n",
>>>> +                       np->full_name);
>>>> +             ret = -EINVAL;
>>>> +             goto out;
>>>> +     }
>>>> +
>>>> +     parser.end = parser.range + rlen / sizeof(__be32);
>>>> +
>>>> +     for_each_of_pci_range(&parser, &range) {
>>>
>>> This is plain wrong - of_pci_range_parser_one() will translate upwards
>>> through parent "ranges" properties, which is completely backwards for
>>> DMA addresses.
>>>
>>> Robin.
>>>
>>
>> No it does not, this patch is thoroughly tested on our SOC and it works.
>> of_pci_range_parser_one does not translate upwards through parent. it
>> just sticks to given PCI parser.
>
> Frankly, I'm losing patience with this attitude. Please look at the code
> you call:
>
> #define for_each_of_pci_range(parser, range) \
>         for (; of_pci_range_parser_one(parser, range);)
>
>
> struct of_pci_range *of_pci_range_parser_one(struct of_pci_range_parser
> *parser,
>                                                 struct of_pci_range *range)
> {
>         const int na = 3, ns = 2;
>
>         if (!range)
>                 return NULL;
>
>         if (!parser->range || parser->range + parser->np > parser->end)
>                 return NULL;
>
>         range->pci_space = parser->range[0];
>         range->flags = of_bus_pci_get_flags(parser->range);
>         range->pci_addr = of_read_number(parser->range + 1, ns);
>         range->cpu_addr = of_translate_address(parser->node,
>                                 parser->range + na);
> ...
>
>
> u64 of_translate_address(struct device_node *dev, const __be32 *in_addr)
> {
>         return __of_translate_address(dev, in_addr, "ranges");
> }
>
>
> I don't doubt that you still manage to get the right result on *your*
> SoC, because you probably have neither further "ranges" nor "dma-ranges"
> translations above your host controller node anyway. That does not
> change the fact that the proposed code is still obviously wrong for more
> complex DT topologies that do.

sorry but I am confused, and sorry again for not getting through on
what you said.

this patch assumes that the root bus would have dma-ranges.
are you saying this code doesn't iterate through way up till it finds
valid dma-ranges ?

or

you are saying
of_pci_range_parser_one() will translate upwards
through parent "ranges" properties

>
> We're doing upstream work in core code here: I don't particularly care
> about making your SoC work; I don't really care about making Juno work
> properly either; what I do care about is that code to parse dma-ranges
> actually parses dma-ranges *correctly* for all possible valid uses of
> dma-ranges, which means fixing the existing bugs and not introducing
> more. The principal side-effect of that is that *all* systems with valid
> DTs will then work correctly.
>

I do see your point now.....and my apologies for not getting it right
at the first time.

but I would not know all the nitty-gritty of all the code of framework and
every complex DT topology of other SOCs.
that is the reason we seek for comments from experts like you, to make
the patch better.

when I say it works on our SOC, I only meant that this patch is
tested. so again apologies there.

there is one obvious problem is
of_translate_dma_address should get called instead of
of_translate_address (so no "ranges") instead ("dma-ranges")

but with that also as you said, it will traverse all the way up to the
DT hierarchy.

I think there are 2 problems with this patch.

1) this patch should try to iterate through all the way up to find
first dma-ranges and should stop when it finds it.
    it should not assume that dma-ranges will always be found at the
current node.

2) of_translate_dma_address is broken, because if point 1 is achieved,
then no need to traverse anymore.

but before that, again seeking your opinion, whether we want to go
down this path.
registering bus specific get_ranges as in PATCH v5 ? in my opinion it
is better way of handling it.

the original patch which you had in mind, you will have to club both
PCI and memory -mapped implementation together.
even if dma-ranges (ignoring flags) remain the same in nature, still
you have to parse it differently. because address-cells are different.
and you will have to handle multiple ranges.

I just tried to bring it out to different path with registering bus
specific callbacks.

> Robin.



More information about the linux-arm-kernel mailing list