[PATCH v3] PCI: dw-rockchip: Enable async probe by default

Mon Mar 16 23:24:47 PDT 2026

Hi Robin,

On Sat, 14 Mar 2026 at 10:42, Anand Moon <linux.amoon at gmail.com> wrote:
>
> Hi Robin,
>
> On Fri, 13 Mar 2026 at 23:07, Robin Murphy <robin.murphy at arm.com> wrote:
> >
> > On 2026-03-13 2:39 pm, Danilo Krummrich wrote:
> > > On Fri Mar 13, 2026 at 2:15 PM CET, Robin Murphy wrote:
> > >> On 2026-03-12 12:59 pm, Danilo Krummrich wrote:
> > >>> On Thu Mar 12, 2026 at 1:48 PM CET, Robin Murphy wrote:
> > >>>> On 2026-03-11 9:09 pm, Danilo Krummrich wrote:
> > >>>>>    From a driver-core perspective I think we're rather limited on what we can do;
> > >>>>> we are already in async context at this point and can't magically go back to
> > >>>>> initcall context.
> > >>>>>
> > >>>>> So, the only thing I can think of is to kick off work on a workqueue, which in
> > >>>>> the end would be the same as the deferred probe handling.
> > >>>>
> > >>>> Hmm, in fact, isn't the deferred probe mechanism itself actually quite
> > >>>> appropriate?
> > >>>
> > >>> Yes, I've also mentioned this in [1], including the fact that it technically
> > >>> even complies with the guarantees given by PROBE_FORCE_SYNCHRONOUS. I.e. the
> > >>> documentation says:
> > >>>
> > >>>     Use this to annotate drivers that need their probe routines to run
> > >>>     synchronously with driver and device registration (with the exception of
> > >>>     -EPROBE_DEFER handling - re-probing always ends up being done
> > >>>     asynchronously).
> > >>>
> > >>> However, I'm still not sure how I feel about this, since I consider this to be
> > >>> more like a workaround that just moves things to a "more approprite" async
> > >>> context.
> > >>
> > >> I guess the underlying problem there is that there are at least 3
> > >> different significant aspects to what "synchronous" can mean:
> > >>
> > >> - literally in the context or device/driver registration as documented.
> > >> Off-hand I'm not really sure what useful property may be *specific* to
> > >> those conditions that a driver might rely on, other than for
> > >> super-special cases like platform_driver_probe().
> > >
> > > I'm not worried about this one, as it is already special by not being compatible
> > > with deferred probe. I.e. the caller already has to promise that the
> > > corresponding probe() call will never return -EPROBE_DEFER.
> > >
> > > This is a much more error prone requirement than just "don't call this from
> > > async" already.
> > >
> > >> - serialised, i.e. probes of multiple devices won't happen concurrently
> > >> on multiple threads. This is probably the one hiding the most driver
> > >> bugs, e.g. internal shared/global state without sufficient
> > >> synchronisation. I guess this falls out as a side-effect of the first
> > >> condition, but AFAICS it *can* also still provided by deferred probe
> > >> (given that it's a single work item iterating a list one-by-one)
> > >>
> > >> - in some regular thread context that isn't liable to have issues
> > >> synchronising against other async_func workers (i.e. the request_module
> > >> case). Again, deferred can't have a problem here, or it wouldn't have
> > >> worked properly in general for the last decade.
> > >>
> > >> So it's not that we'd be relying on some dubious "deferred is always
> > >> synchronous" assumption - AFAICS deferred *can* launch async if the
> > >> driver permits it - more just ratifying that deferred is still able to
> > >> effectively honour all the useful properties of PROBE_FORCE_SYNCHRONOUS
> > >> other than "during registration of the thing".
> > >
> > > Yes, and as mentioned above, EPROBE_DEFER has always been a valid path for
> > > PROBE_FORCE_SYNCHRONOUS, which is why I brought it up as a a workaround in the
> > > first place.
> > >
> > > (The reason why I still say "workaround" is because nothing actually needs to be
> > > deferred.)
> > >
> > > Anyways, as I mentioned...
> > >
> > >>> On the other hand, eventually we want everything to work with
> > >>> PROBE_PREFER_ASYNCHRONOUS, so maybe it's also good enough for the time being.
> > >
> > > ...plus there are not a lot of PROBE_FORCE_SYNCHRONOUS left anyways.
> > >
> > > Do you want to send a patch?
> >
> > Sure, I just wanted to check that you didn't see anything obviously
> > wrong with my reasoning-by-inspection - if you're sufficiently happy
> > then I'll write this up as a proper commit message and post the patch
> > (likely on Monday now) so we can all nit-pick the details :)
> >
> I believe the issue lies in the r8169 module, which currently has the
> PHY address hardcoded to zero in mdiobus_get_phy. I have modified the
> implementation to scan the MDIO bus for all possible addresses to improve
> hardware compatibility.
>
> Can you check this at your end.
>
> ----->8-----
> $ git diff drivers/net/ethernet/realtek/r8169_main.c
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c
> b/drivers/net/ethernet/realtek/r8169_main.c
> index 58788d196c57..fa99e33eccfe 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -5420,7 +5420,8 @@ static int r8169_mdio_register(struct rtl8169_private *tp)
>  {
>         struct pci_dev *pdev = tp->pci_dev;
>         struct mii_bus *new_bus;
> -       int ret;
> +       struct phy_device *phydev;
> +       int ret, addr;
>
>         /* On some boards with this chip version the BIOS is buggy and misses
>          * to reset the PHY page selector. This results in the PHY ID read
> @@ -5455,18 +5456,30 @@ static int r8169_mdio_register(struct
> rtl8169_private *tp)
>         if (ret)
>                 return ret;
>
> -       tp->phydev = mdiobus_get_phy(new_bus, 0);
> -       if (!tp->phydev) {
> +       /* find the first (lowest address) PHY on the current MAC's MII bus */
> +       for (addr = 0; addr < PHY_MAX_ADDR; addr++) {
> +               struct phy_device *tmp = mdiobus_get_phy(new_bus, addr);
> +               if (tmp) {
> +                       phydev = tmp;
> +                       break;
> +               }
> +       }
> +
> +       if (!phydev) {
> +               dev_err(&pdev->dev, "no PHY found on bus\n");
>                 return -ENODEV;
> -       } else if (!tp->phydev->drv) {
> -               /* Most chip versions fail with the genphy driver.
> -                * Therefore ensure that the dedicated PHY driver is loaded.
> -                */
> +       }
> +
> +       /* Most chip versions fail with the genphy driver.
> +        * Therefore ensure that the dedicated PHY driver is loaded.
> +        */
> +       if (!phydev->drv || phy_driver_is_genphy(phydev)) {
>                 dev_err(&pdev->dev, "no dedicated PHY driver found for
> PHY ID 0x%08x, maybe realtek.ko needs to be added to initramfs?\n",
>                         tp->phydev->phy_id);
>                 return -EUNATCH;
>         }
>
> +       tp->phydev = phydev;
>         tp->phydev->mac_managed_pm = true;
>         if (rtl_supports_eee(tp))
>                 phy_support_eee(tp->phydev);
>
> > Cheers,
> > Robin.
>
I've submitted a patch to address the issue you reported.
Could you please verify if this resolves it on your side?

[1] https://lore.kernel.org/all/20260317061700.7734-1-linux.amoon@gmail.com/

Thanks
-Anand