Using the generic host PCIe driver

Ard Biesheuvel ard.biesheuvel at linaro.org
Sat Mar 4 03:45:48 PST 2017


On 4 March 2017 at 10:56, Mason <slash.tmp at free.fr> wrote:
> On 04/03/2017 10:35, Ard Biesheuvel wrote:
>> On 3 March 2017 at 23:23, Mason <slash.tmp at free.fr> wrote:
>>> On 03/03/2017 21:04, Bjorn Helgaas wrote:
>>>> On Fri, Mar 03, 2017 at 06:18:02PM +0100, Mason wrote:
>>>>> On 03/03/2017 16:46, Bjorn Helgaas wrote:
>>>>>> On Fri, Mar 03, 2017 at 01:44:54PM +0100, Mason wrote:
>>>>>>
>>>>>>> For now, I have "hidden" the root's BAR0 from the system with:
>>>>>>>
>>>>>>>    /* Pretend the Root Port's BAR0 is unimplemented: read as zero */
>>>>>>>    if (bus->number == 0 && where == PCI_BASE_ADDRESS_0) {
>>>>>>>            *val = 0;
>>>>>>>            return PCIBIOS_SUCCESSFUL;
>>>>>>>    }
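
For reference, a fuller version of that accessor might look like the
sketch below. It is only an illustration, not your actual code: the
function name is made up, hiding the upper half of the 64-bit BAR at
offset 0x14 is an assumption, and pci_generic_config_read() stands in
for whatever the driver otherwise uses to read config space.

        static int my_pcie_config_read(struct pci_bus *bus, unsigned int devfn,
                                       int where, int size, u32 *val)
        {
                /* Hide both halves of the Root Port's 64-bit BAR0 */
                if (bus->number == 0 && devfn == 0 &&
                    (where == PCI_BASE_ADDRESS_0 || where == PCI_BASE_ADDRESS_1)) {
                        *val = 0;
                        return PCIBIOS_SUCCESSFUL;
                }

                return pci_generic_config_read(bus, devfn, where, size, val);
        }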
>>>>>>
>>>>>> I'm scratching my head about this a little.  Here's what your dmesg
>>>>>> log contained originally:
>>>>>>
>>>>>>   pci 0000:00:00.0: [1105:8758] type 01 class 0x048000
>>>>>>   pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x00ffffff 64bit]
>>>>>>   pci 0000:00:00.0: BAR 0: no space for [mem size 0x01000000 64bit]
>>>>>>   pci 0000:00:00.0: BAR 0: failed to assign [mem size 0x01000000 64bit]
>>>>>>   pci 0000:00:00.0: PCI bridge to [bus 01]
>>>>>>   pcieport 0000:00:00.0: enabling device (0140 -> 0142)
>>>>>>
>>>>>> This device is a bridge (a Root Port, per your lspci output).  With a
>>>>>> BAR, which is legal but unusual.  We couldn't assign space for the
>>>>>> BAR, which means we can't use whatever vendor-specific functionality
>>>>>> it provides.
>>>>>
>>>>> I had several chats with the HW designer. I'll try to explain, only as
>>>>> far as I could understand ;-)
>>>>>
>>>>> We used to make devices, before implementing a root. Since at least
>>>>> one BAR is required (?) for a device, it was decided to have one BAR
>>>>> for the root, for symmetry.
>>>>
>>>> I'm not aware of a spec requirement for any BARs.  It's conceivable
>>>> that one could build a device that only uses config space.  And of
>>>> course, most bridges have windows but no BARs.  But that doesn't
>>>> matter; the hardware is what it is and we have to deal with it.
>>>
>>> I appreciate the compassion. RMK considered the DMA HW too screwy
>>> to bother supporting ;-)
>>>
>>>>> In fact, I thought I could ignore that BAR, but it is apparently NOT
>>>>> the case, as MSIs are supposed to be sent *within* the BAR of the root.
>>>>
>>>> I don't know much about this piece of the MSI puzzle, but maybe Marc
>>>> can enlighten us.  If this Root Port is the target of MSIs and the
>>>> Root Port turns them into some sort of interrupt on the CPU side, I
>>>> can see how this might make sense.
>>>>
>>>> I think it's unusual for the PCI core to assign the MSI target using a
>>>> BAR, though.  I think this means you'll have to implement your
>>>> arch_setup_msi_irq() or .irq_compose_msi_msg() method such that it
>>>> looks up that BAR value, since you won't know it at build-time.
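
To illustrate Bjorn's suggestion, an .irq_compose_msi_msg hook along
these lines could work. A rough sketch only: the driver struct, its
msi_doorbell field (read from the Root Port's BAR at probe time), and
using hwirq as the MSI data are all assumptions, not your actual
driver.

        static void my_pcie_compose_msi_msg(struct irq_data *d,
                                            struct msi_msg *msg)
        {
                struct my_pcie *pcie = irq_data_get_irq_chip_data(d);
                /* Doorbell address discovered from the Root Port's BAR
                 * at probe time, not hard-coded at build time */
                phys_addr_t doorbell = pcie->msi_doorbell;

                msg->address_lo = lower_32_bits(doorbell);
                msg->address_hi = upper_32_bits(doorbell);
                msg->data = d->hwirq;
        }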
>>>
>>> I'll hack the Altera driver to fit my purpose.
>>>
>>>>> The weird twist is that the BAR advertises a 64-bit memory zone,
>>>>> but we will, in fact, map MMIO registers behind it. So all the
>>>>> RAM Linux assigns to the area is wasted, IIUC.
>>>>
>>>> I'm not sure what this means.  You have this:
>>>>
>>>>> OF: PCI:   MEM 0x90000000..0x9fffffff -> 0x90000000
>>>
>>> This means I've put 256 MB of system RAM aside for PCIe devices.
>>> This memory is no longer available for Linux "stuff".
>>>
>>
>> No it doesn't. It is a physical memory *range* that is assigned to the
>> PCI host bridge. Any memory accesses by the CPU to that window will be
>> forwarded to the PCI bus by the host bridge. From the kernel driver's
>> POV, this range is a given, but your host bridge h/w may involve some
>> configuration to make the host bridge 'listen' to this range. This is
>> h/w specific, and as Bjorn pointed out, usually configured by the
>> firmware so that the kernel driver does not require any knowledge of
>> those internals.
>>
>>>>> pci_bus 0000:00: root bus resource [mem 0x90000000-0x9fffffff]
>>>
>>> I suppose this is the PCI bus address. As we've discussed,
>>> I used the identity mapping for bus <-> CPU addresses.
>>>
>>
>> Yes, that is fine
>>
>>>> This [mem 0x90000000-0x9fffffff] host bridge window means there can't
>>>> be RAM in that region.  CPU accesses to 0x90000000-0x9fffffff have to
>>>> be claimed by the host bridge and forwarded to PCI.
>>>>
>>>> Linux doesn't "assign system RAM" anywhere; we just learn somehow
>>>> where that RAM is.  Linux *does* assign BARs of PCI devices, and they
>>>> have to be inside the host bridge window(s).
>>>
>>> I'm confused, I thought I had understood that part...
>>> I thought the binding required me to specify (in the "ranges"
>>> property) a non-prefetchable zone of system RAM, and this
>>> memory is then "handed out" by Linux to different devices.
>>> Or do I just need to specify some address range that's not
>>> necessarily backed with actual RAM?
>>>
>>
>> Yes. Each PCI device advertises its need of memory windows via its
>> BARs, but the actual placement of those windows inside the host
>> bridge's memory range is configured dynamically, usually by the
>> firmware (on PCs) but on ARM/arm64 systems, this is done from scratch
>> by the kernel. The *purpose* of those memory windows is device
>> specific, but whatever is behind it lives on the PCI device. So this
>> is *not* system RAM.
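
As an aside, this is also why an endpoint driver never hard-codes such
addresses; it simply maps whatever the core assigned. A one-line sketch
(pdev being the driver's struct pci_dev):

        /* Map BAR0 wherever the PCI core placed it inside the window */
        void __iomem *regs = pci_iomap(pdev, 0, 0);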
>
> Hello Ard,
>
> It appears I have misunderstood something fundamental.
>
> The binding for generic PCI support
> http://lxr.free-electrons.com/source/Documentation/devicetree/bindings/pci/host-generic-pci.txt
> requires two address-type specs
> (please correct me if I'm wrong)
> 1) in the "reg" prop, the address of the configuration space (CPU physical)
> 2) in the "ranges" prop, at least a non-prefetchable area
> http://elinux.org/Device_Tree_Usage#PCI_Address_Translation
>
> In my 32-bit system, there are 2GB of RAM at [0x8000_0000,0x10000_0000[
> There are MMIO registers at [0, 16MB[ and also other stuff higher
> Suppose there is nothing mapped at [0x7000_0000, 0x8000_0000[
>
> Can I provide that range to the PCI subsystem?

Well, it obviously needs to be a range that is not otherwise occupied.
But it is SoC-specific where the forwarded MEM region(s) are, and
whether they are configurable or not. IOW, you can ask *us* all you
want about these details, but only the h/w designer can answer this
for you.

The DT node that describes the host bridge should simply list which
MMIO regions the device uses. This is no different from any other
MMIO peripheral.
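
To make that concrete, a minimal node for the generic ECAM driver could
look roughly like the sketch below, reusing the free 0x7000_0000 window
from your example as the non-prefetchable MEM range. Note the ECAM base
address (0x50000000 here) and the bus-range are placeholders, and the
interrupt properties are omitted:

        pcie@50000000 {
                compatible = "pci-host-ecam-generic";
                device_type = "pci";
                #address-cells = <3>;
                #size-cells = <2>;
                /* 4 buses x 1 MB of ECAM config space */
                bus-range = <0 3>;
                reg = <0x50000000 0x400000>;
                /* 256 MB non-prefetchable MEM window, identity-mapped:
                 * PCI 0x70000000 -> CPU 0x70000000 */
                ranges = <0x02000000 0x0 0x70000000  0x70000000  0x0 0x10000000>;
        };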

As for the bus ranges: this also depends on the h/w, as far as I know,
and has a direct relation with the size of the PCI configuration space
(1 MB per bus for ECAM, IIRC). On 32-bit systems, supporting the full
256-bus range may be costly in terms of 32-bit addressable space, given
that the PCIe config space typically sits below 4 GB. But it all
depends on the h/w implementation.
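
For reference, the ECAM arithmetic works out as:

        4 KB per function x 8 functions x 32 devices = 1 MB per bus
        full 256-bus range: 256 x 1 MB = 256 MB of config space

which is why 32-bit platforms usually shrink bus-range to just the
buses they actually need.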

-- 
Ard.


