[PATCH u-boot] net: phy: meson-gxl: detect LPA corruption
Neil Armstrong
narmstrong at baylibre.com
Mon Dec 18 03:59:27 PST 2017
On 12/12/2017 16:03, Neil Armstrong wrote:
> From: Jerome Brunet <jbrunet at baylibre.com>
>
> This patch is ported from the Linux patch posted at [1] and applied to
> net tree as commit f1e2400a80ff.
>
> The purpose of this change is to fix the incorrect detection of the link
> partner (LP) advertised capabilities which sometimes happens with this PHY
> (roughly 1 time in a dozen)
>
> This issue may cause the link to be negotiated at 10Mbps/Full or
> 10Mbps/Half when 100MBps/Full is actually possible. In some case, the link
> is even completely broken and no communication is possible.
>
> To detect the corruption, we must look for a magic undocumented bit in the
> WOL bank (hint given by the SoC vendor kernel) but this is not enough to
> cover all cases. We also have to look at the LPA ack. If the LP supports
> Aneg but did not ack our base code when aneg is completed, we assume
> something went wrong.
>
> The detection of a corrupted LPA triggers a restart of the aneg process.
> This solves the problem but may take up to 6 retries to complete.
>
> [1] https://lkml.kernel.org/r/20171208110811.30789-1-jbrunet@baylibre.com
>
> Fixes: 8995a96d1d67 ("net: phy: Add Amlogic Meson GXL Internal PHY support")
> Signed-off-by: Jerome Brunet <jbrunet at baylibre.com>
> Signed-off-by: Neil Armstrong <narmstrong at baylibre.com>
> ---
>
> This patch has been tested on LibreTech-CC and adapted to U-Boot PHY calls.
>
> Comments from Jerome from the original patch :
> ""
> I suppose this patch probably seems a bit hacky, especially the part
> about the link partner acknowledge. I'm trying to figure out if the
> value in MII_LPA makes sense but I don't have such a deep knowledge
> of the ethernet spec.
>
> To me, it does not makes sense for the LP to support ANEG (Bit 1 in
> MII_EXPENSION), the aneg to have successfully complete and, at the
> same time, LP does not ACK our base code word, which we should have
> sent during this aneg.
>
> If you think this may have unintended consequences or if you have
> an idea to do this differently, feel free to let me know.
>
> This fix has been tested on the libretech-cc and khadas VIM
> ""
>
> drivers/net/phy/meson-gxl.c | 88 ++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 87 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/phy/meson-gxl.c b/drivers/net/phy/meson-gxl.c
> index ccf70c9..5f4ecb1 100644
> --- a/drivers/net/phy/meson-gxl.c
> +++ b/drivers/net/phy/meson-gxl.c
> @@ -10,8 +10,94 @@
> #include <config.h>
> #include <common.h>
> #include <linux/bitops.h>
> +#include <dm.h>
> #include <phy.h>
>
> +/* This function is provided to cope with the possible failures of this phy
> + * during aneg process. When aneg fails, the PHY reports that aneg is done
> + * but the value found in MII_LPA is wrong:
> + * - Early failures: MII_LPA is just 0x0001. if MII_EXPANSION reports that
> + * the link partner (LP) supports aneg but the LP never acked our base
> + * code word, it is likely that we never sent it to begin with.
> + * - Late failures: MII_LPA is filled with a value which seems to make sense
> + * but it actually is not what the LP is advertising. It seems that we
> + * can detect this using a magic bit in the WOL bank (reg 12 - bit 12).
> + * If this particular bit is not set when aneg is reported being done,
> + * it means MII_LPA is likely to be wrong.
> + *
> + * In both case, forcing a restart of the aneg process solve the problem.
> + * When this failure happens, the first retry is usually successful but,
> + * in some cases, it may take up to 6 retries to get a decent result
> + */
> +int meson_gxl_startup(struct phy_device *phydev)
> +{
> + unsigned int retries = 10;
> + int ret, wol, lpa, exp;
> +
> +restart_aneg:
> + ret = genphy_update_link(phydev);
> + if (ret)
> + return ret;
> +
> + if (phydev->autoneg == AUTONEG_ENABLE) {
> + /* Need to access WOL bank, make sure the access is open */
> + ret = phy_write(phydev, MDIO_DEVAD_NONE, 0x14, 0x0000);
> + if (ret)
> + return ret;
> + ret = phy_write(phydev, MDIO_DEVAD_NONE, 0x14, 0x0400);
> + if (ret)
> + return ret;
> + ret = phy_write(phydev, MDIO_DEVAD_NONE, 0x14, 0x0000);
> + if (ret)
> + return ret;
> + ret = phy_write(phydev, MDIO_DEVAD_NONE, 0x14, 0x0400);
> + if (ret)
> + return ret;
> +
> + /* Request LPI_STATUS WOL register */
> + ret = phy_write(phydev, MDIO_DEVAD_NONE, 0x14, 0x8D80);
> + if (ret)
> + return ret;
> +
> + /* Read LPI_STATUS value */
> + wol = phy_read(phydev, MDIO_DEVAD_NONE, 0x15);
> + if (wol < 0)
> + return wol;
> +
> + lpa = phy_read(phydev, MDIO_DEVAD_NONE, MII_LPA);
> + if (lpa < 0)
> + return lpa;
> +
> + exp = phy_read(phydev, MDIO_DEVAD_NONE, MII_EXPANSION);
> + if (exp < 0)
> + return exp;
> +
> + if (!(wol & BIT(12)) ||
> + ((exp & EXPANSION_NWAY) && !(lpa & LPA_LPACK))) {
> +
> + /* Looks like aneg failed after all */
> + if (!retries) {
> + printf("%s LPA corruption max attempts\n",
> + phydev->dev->name);
> + return -ETIMEDOUT;
> + }
> +
> + printf("%s LPA corruption - aneg restart\n",
> + phydev->dev->name);
> +
> + ret = genphy_restart_aneg(phydev);
> + if (ret)
> + return ret;
> +
> + --retries;
> +
> + goto restart_aneg;
> + }
> + }
> +
> + return genphy_parse_link(phydev);
> +}
> +
> static int meson_gxl_phy_config(struct phy_device *phydev)
> {
> /* Enable Analog and DSP register Bank access by */
> @@ -45,7 +131,7 @@ static struct phy_driver meson_gxl_phy_driver = {
> .mask = 0xfffffff0,
> .features = PHY_BASIC_FEATURES,
> .config = &meson_gxl_phy_config,
> - .startup = &genphy_startup,
> + .startup = &meson_gxl_startup,
> .shutdown = &genphy_shutdown,
> };
>
>
Hi Tom,
This patch made the LibreTech-CC board survive 14638 reboots while
triggering 593 LPA corruption and successfully getting a DHCP answer
in each case.
Neil
More information about the linux-amlogic
mailing list