[LEDE-DEV] imx6: fail to start IBSS link

Tim Harvey tharvey at gateworks.com
Mon Apr 3 07:03:06 PDT 2017


On Mon, Mar 6, 2017 at 7:52 AM, Koen Vandeputte
<koen.vandeputte at ncentric.com> wrote:
>
>
> On 2017-02-17 17:19, Koen Vandeputte wrote:
>>
>>
>>
>>> Koen,
>>>
>>> Can you try to disable MSI? I've seen issues with it in the past for
>>> IMX6 and I typically leave it disabled as it doesn't buy us anything
>>> and can instead hurt performance. If I recall, I think it's now
>>> 'required' by the IMX6 PCIe driver so it may take a kernel change to
>>> disable it. Other than that, how does mainline 4.9 behave and what
>>> card/chipset are you using?
>>>
>>> Tim
>>
>>
>> Hi Tim,
>>
>> I will try with disabled MSI and let you know.
>> The earliest time I see in my planning is next week Friday.
>>
>> fyi, I'm testing on 3 different Ventana boards:
>>
>> - GW5100    (dualcore - single PCIe)
>> - GW5200    (dualcore - Dual PCIe)
>> - GW5410    (quadcore - 6x PCIe)
>>
>> All 3 boards utilize a single MikroTik R11e-5HnD radio (AR 9300 based)
>>

Koen,

Sorry for the late reply - I keep getting diverted elsewhere.

When the IMX6 PCIe host controller uses MSI, legacy interrupts stop
working, so any card/driver that relies on legacy interrupts will not
receive any. I don't have a list of the cards/drivers that require
legacy interrupts, but ath9k is one of them: I just verified that it
doesn't get any interrupts on current LEDE master with 4.9.

The Linux 4.5 kernel enables PCI_MSI by default for
imx_v6_v7_defconfig (31e98e0d24cd2537a63e06e235e050a06b175df7) and the
Linux 4.8 kernel additionally requires PCI_MSI to be enabled for IMX6
(3ee803641e76bea76ec730c80dcc64739a9919ff). I'm discussing this
upstream as I don't think MSI should be enabled on IMX6.

You can check the ath9k interrupt count (grep ath9k /proc/interrupts)
to see this: if it is still 0 after your radio is up and running,
you've hit this issue.
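
On multi-core boards the count is split across one column per CPU, so
it can help to sum the columns. The following is a minimal sketch and
not part of the original thread: the helper name irq_total and its
optional file argument are made up for illustration, and it assumes
the usual /proc/interrupts layout (an "NN:" label followed by one
count per CPU).

```shell
# Sum the per-CPU interrupt counts on every line of an interrupts table
# that mentions the given name.
# $1: name to look for (e.g. ath9k)
# $2: table to read; defaults to /proc/interrupts (hypothetical override,
#     handy for trying the function on a saved copy)
irq_total() {
    name=$1
    file=${2:-/proc/interrupts}
    awk -v name="$name" '
        index($0, name) {
            # Field 1 is the "NN:" label; the following numeric fields are
            # the per-CPU counts, ending at the interrupt controller name.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        }
        END { print total + 0 }
    ' "$file"
}
```

Running "irq_total ath9k" on the board and getting 0 while the radio is
up would match the symptom described above.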

You can do the following to hack out the requirement of MSI for the
IMX6 PCIe host controller, then disable CONFIG_PCI_MSI in the kernel
config:

diff --git a/drivers/pci/dwc/Kconfig b/drivers/pci/dwc/Kconfig
index dfb8a69..31cf8ad 100644
--- a/drivers/pci/dwc/Kconfig
+++ b/drivers/pci/dwc/Kconfig
@@ -6,7 +6,6 @@ config PCIE_DW
 config PCIE_DW_HOST
         bool
        depends on PCI
-       depends on PCI_MSI_IRQ_DOMAIN
         select PCIE_DW

 config PCI_DRA7XX
@@ -45,7 +44,6 @@ config PCI_IMX6
        bool "Freescale i.MX6 PCIe controller"
        depends on PCI
        depends on SOC_IMX6Q
-       depends on PCI_MSI_IRQ_DOMAIN
        select PCIEPORTBUS
        select PCIE_DW_HOST
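
Once the dependency is removed and the option is switched off, it is
worth confirming that the resulting kernel config really has PCI_MSI
disabled. A small sketch, assuming the standard kernel .config format;
the helper name pci_msi_state is hypothetical and the location of the
config file depends on your build tree.

```shell
# Report whether a kernel config file leaves PCI_MSI enabled.
# $1: path to the config file; defaults to .config in the current directory
pci_msi_state() {
    config=${1:-.config}
    # A disabled option appears as "# CONFIG_PCI_MSI is not set", so only
    # an exact "CONFIG_PCI_MSI=y" line counts as enabled. The "=" keeps
    # CONFIG_PCI_MSI_IRQ_DOMAIN from matching.
    if grep -q '^CONFIG_PCI_MSI=y' "$config"; then
        echo enabled
    else
        echo disabled
    fi
}
```

If this still prints "enabled" after a rebuild, some other option is
probably selecting PCI_MSI again.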



>>
>> Other issues seen so far compared to kernel 4.4:
>> - A simple "reboot" doesn't work.  UART output shows "Reboot failed" and
>> the board stalls. Powercycle is needed

This can occur on older board revisions where the PMIC is not reset on
an IMX6 watchdog reset, and a watchdog reset (which is what a soft
reboot uses) occurs while the CPU is running above 800MHz. Can you
provide the serial number of the board you are seeing this on, and
verify that the issue does not occur if you force the CPU to 800MHz
(i.e. with the userspace cpufreq governor) prior to reset?
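
For reference, pinning the frequency through the userspace governor can
be sketched as below. This is an illustration under assumptions, not a
tested procedure for these boards: the helper name pin_cpu_freq and its
overridable sysfs root are made up, the kernel must have the userspace
governor built in, and the exact value to write should be taken from
scaling_available_frequencies (the operating point nearest 800MHz may
be listed as 792000 kHz, which is an assumption to check on the board).

```shell
# Pin one CPU's cpufreq policy to a fixed frequency via the userspace
# governor.
# $1: target frequency in kHz
# $2: CPU number; defaults to 0
# $3: sysfs root; defaults to /sys/devices/system/cpu (hypothetical
#     override so the function can be tried on a scratch directory)
pin_cpu_freq() {
    khz=$1
    cpu=${2:-0}
    root=${3:-/sys/devices/system/cpu}
    dir=$root/cpu$cpu/cpufreq
    # Hand frequency control to userspace, then request the frequency.
    echo userspace > "$dir/scaling_governor" || return 1
    echo "$khz" > "$dir/scaling_setspeed" || return 1
    # Read back what the kernel reports for the requested speed.
    cat "$dir/scaling_setspeed"
}
```

A possible test sequence would be "pin_cpu_freq 792000" followed by
"reboot", then comparing against a reboot at the default governor.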

The work-around for this is to use the Gateworks System Controller
watchdog to restart the board which does a full board power cycle, but
I haven't had time to get that driver mainlined yet (and thus have
also not submitted it to LEDE/OpenWrt).

>> - UART DMA disabled is required to avoid some boot errors (I've made a
>> custom backport from your upstream patch fixing this, but not submitted here
>> yet)

Which boot error specifically? I don't think I've seen it, but I can
confirm that UART DMA needs to be disabled for RS485 to work (a more
obscure case), which is why I've done that on our kernels. AFAIK there
are still some issues upstream with IMX UART flow control and
mctrl_gpio.

>>
>> General issues in kernels 4.4 & 4.9
>> - Even using the latest UBI FS sources + using the Sync option in bootarg,
>> files can get corrupted on a power cut.  If the corrupted file is a boot
>> file .. :)

Can you point me to documentation on this bootarg? I'm not familiar with it.

>>
>>
>>
>> Other than this it runs pretty stable :)
>>
> Tim,
>
> I found 1 more issue on 4.4 & 4.9 kernels:
>
> https://lists.debian.org/debian-arm/2016/02/msg00000.html
>
> I'm also seeing this on 4.4 kernel.
> It can take up to a few days before it triggers normally, but I have a setup
> running which reproduces this within a few hours.
>
> I've made a patch which increases the timeout in the FEC driver just for
> testing .. but it still occurs causing the port to be disabled suddenly.
>

I've seen reports of this as well, but it usually takes days of
activity before it happens, if it happens at all. The MDIO timeout in
FEC is currently 3ms - what did you increase it to, and are you certain
it makes these issues go away? Perhaps we need to start a discussion
about this on linux-net. I'm not clear if an MDIO read timeout should
cause an interface to go down (or if some layer should retry). I'm also
not clear why an MDIO read would not complete in 3ms.
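
When comparing timeout values it helps to count how often the error
shows up in the kernel log over a test run. A minimal sketch, assuming
the FEC driver reports the failure with the text "MDIO read timeout"
(check the exact wording in your own logs); the helper name
mdio_timeouts is hypothetical.

```shell
# Count kernel log lines that report an MDIO read timeout.
# Reads the files given as arguments, or standard input if there are none.
mdio_timeouts() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    cat "$@" | grep -c 'MDIO read timeout' || true
}
```

Typical use would be "dmesg | mdio_timeouts" before and after changing
the timeout, on the setup that reproduces the problem within hours.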

Tim


