Question about a quirky (DesignWare) PCIe RC in Allwinner H6
Icenowy Zheng
icenowy at aosc.io
Thu Mar 8 05:18:36 PST 2018
On Thu, 2018-03-08 at 12:55 +0100, Marc Gonzalez wrote:
> On 08/03/2018 06:48, Icenowy Zheng wrote:
>
> > I'm trying to implement a driver for the quirky (DW) PCIe RC in the
> > Allwinner H6 SoC.
> >
> > The quirk is that only the "dbi" space is always mapped, but at the
> > same time only 64KiB of other spaces (config, downstream IO and
> > non-prefetchable memory) are accessible. To access a certain address
> > the high 16-bit of the address (all bus addresses in H6 SoC are
> > 32-bit despite the CPU is 64-bit) needs to be written into the
> > PCIE_ADDR_PAGE_CFG register (a vendor-defined register in DBI
> > space). So the access to these spaces cannot be processed correctly
> > with just readl/writel, as the existing code does.
> >
> > Is it possible to workaround this in the PCI subsystem of Linux?
>
> I didn't think anyone would come up with something more broken
> than tango's PCIe host bridge...
>
> It's hard to understand what's going on inside the minds of these
> HW devs. I mean, if the steps required are
>
> WRITE MAGIC CONFIG REG
> READ/WRITE ADDRESS REG
>
> then the whole operation is clearly not atomic, and bad things happen
> when 2+ operations race.
>
> Do they think in terms of single core systems with non-multitasking
> OS? Probably not.
>
> Maybe they think it is not a problem to wrap the operation using a
> mutex? I'll confess I have no idea how bad that is for performance.
> However, that is not an option in Linux, because mem space accesses
> are just plain mmio accesses, and it's not possible to rewrite the
> drivers, even as an out-of-tree patch.
>
> I suppose it could be possible to make the first write "magic" in the
> sense that it could "lock" the bus until the second access is
> performed? Sort of like an implicit HW mutex. Actually, tango has
> explicit HW mutexes that work along these lines.
Unfortunately sun50i-h6 doesn't have it.
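Without a hardware mutex, the two steps would have to be serialized in
software. Below is a minimal user-space sketch of the access sequence
described above (write the high 16 bits of the bus address to the page
register, then access the low 16 bits through the window). It is only a
simulation: sim_bus, sim_page_cfg, the atomic_flag lock and the
paged_read32/paged_write32 names are all hypothetical stand-ins, not the
real H6 registers or any existing kernel API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define WINDOW_SIZE (1u << 16)  /* only 64KiB is visible at a time */
#define SIM_PAGES   4u          /* size of the fake downstream space */

/* Hypothetical model: fake bus memory and a stand-in for PCIE_ADDR_PAGE_CFG. */
static uint8_t sim_bus[SIM_PAGES][WINDOW_SIZE];
static uint32_t sim_page_cfg;

/* Stand-in for a kernel spinlock protecting the page register. */
static atomic_flag sim_lock = ATOMIC_FLAG_INIT;

static void sim_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&sim_lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void sim_lock_release(void)
{
	atomic_flag_clear_explicit(&sim_lock, memory_order_release);
}

/*
 * Paged 32-bit read. Aligned accesses only: the low two address bits
 * are ignored in this model.
 */
uint32_t paged_read32(uint32_t bus_addr)
{
	uint32_t val;

	sim_lock_acquire();
	sim_page_cfg = bus_addr >> 16;              /* step 1: select the page */
	assert(sim_page_cfg < SIM_PAGES);
	memcpy(&val, &sim_bus[sim_page_cfg][bus_addr & 0xfffcu], sizeof(val));
	sim_lock_release();                         /* step 2 done, drop lock */
	return val;
}

/* Paged 32-bit write, same two-step sequence under the same lock. */
void paged_write32(uint32_t bus_addr, uint32_t val)
{
	sim_lock_acquire();
	sim_page_cfg = bus_addr >> 16;
	assert(sim_page_cfg < SIM_PAGES);
	memcpy(&sim_bus[sim_page_cfg][bus_addr & 0xfffcu], &val, sizeof(val));
	sim_lock_release();
}
```

This only helps for accesses that actually go through such an accessor.
Config space accesses could, but as you say, endpoint drivers do plain
MMIO accesses to mem space and would never take the lock.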
>
> > (I have thought a workaround that only maps the current accessible
> > 64KiB with the MMU, and when accessing the non-accessible part,
> > catch the page fault and re-setup the map to the new 64KiB page.
> > But surely it will kill the performance.)
>
> The implementation might be non-trivial, as well.
It might be possible to set up a hypervisor to do it (although that is
surely an abuse of HYP/EL2 mode).
>
> > [tango is] still less quirky than Allwinner H6 PCIe. It's only a
> > config/MMIO mux on tango; however on H6 PCIe both config space and
> > MMIO space are splitted to many pages. So on H6 if no solution is
> > worked out, it will not be unreliable -- it will be unusable
> > instead.
>
> I have only limited experience, but it seems that many useful PCIe
> adapters require only very little mem address space.
But there are also many that require more than 64K of memory address
space, e.g. the Atheros AR9462 802.11n adapter needs 512K of
non-prefetchable memory.
>
> For example, the USB3 PCIe adapter I tested my implementation with
> requires only 8 KB.
>
> 01:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03) (prog-if 30 [XHCI])
>     Flags: fast devsel
>     Memory at 50400000 (64-bit, non-prefetchable) [size=8K]
>     Capabilities: [50] Power Management version 3
>     Capabilities: [70] MSI: Enable- Count=1/8 Maskable- 64bit+
>     Capabilities: [90] MSI-X: Enable- Count=8 Masked-
>     Capabilities: [a0] Express Endpoint, MSI 00
>     Capabilities: [100] Advanced Error Reporting
>     Capabilities: [150] Latency Tolerance Reporting
>
>
> Same thing for the following WiFi adapter.
>
> 01:00.0 Network controller: Intel Corporation Wireless 7260 (rev bb)
>     Subsystem: Intel Corporation Dual Band Wireless-AC 7260
>     Flags: fast devsel, IRQ 22
>     Memory at 50400000 (64-bit, non-prefetchable) [disabled] [size=8K]
>     Capabilities: [c8] Power Management version 3
>     Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
>     Capabilities: [40] Express Endpoint, MSI 00
>     Capabilities: [100] Advanced Error Reporting
>     Capabilities: [140] Device Serial Number 48-51-b7-ff-ff-84-69-26
>     Capabilities: [14c] Latency Tolerance Reporting
>     Capabilities: [154] Vendor Specific Information: ID=cafe Rev=1 Len=014 <?>
>
>
> Maybe you can decide that you will support only 64 KB?
Although some devices will not be supported, I still think this is a
good idea. It makes the problem much simpler: it becomes a mux among
config space, IO space and 64KB of non-prefetchable memory.
If a PCIe-fixing hypervisor is written later, the kernel glue can simply
be skipped and the kernel can configure the controller in ECAM mode.
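To make the 64KB restriction concrete, here is a rough sketch of the
check such a driver would need: a memory region is only usable if it
lies entirely inside one 64KiB-aligned page. The function name and the
H6_WINDOW_SIZE macro are made up for illustration, and the 0x50400000
address is just borrowed from your lspci output.

```c
#include <stdbool.h>
#include <stdint.h>

#define H6_WINDOW_SIZE 0x10000u  /* 64KiB page selected by the high 16 bits */

/*
 * Hypothetical helper: returns true if [start, start + size) fits inside
 * a single 64KiB-aligned page, so no page switch is ever needed for it.
 */
bool h6_region_fits_window(uint32_t start, uint32_t size)
{
	uint64_t last;

	if (size == 0 || size > H6_WINDOW_SIZE)
		return false;

	/* 64-bit arithmetic avoids wrap-around near the top of the space. */
	last = (uint64_t)start + size - 1;
	return (start >> 16) == (uint32_t)(last >> 16);
}
```

With this check, an 8K region at 0x50400000 (like your xHCI and Intel
WiFi examples) passes, while the 512K needed by the AR9462 does not.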
>
> Regards.
>