[PATCH v2 net 2/5] net: dsa: be compatible with masters which unregister on shutdown

Fri Sep 13 13:29:16 PDT 2024

Hi Alexander,

On Wed, Sep 04, 2024 at 08:31:13AM +0000, Sverdlin, Alexander wrote:
> > +static void lan9303_mdio_shutdown(struct mdio_device *mdiodev)
> > +{
> > +	struct lan9303_mdio *sw_dev = dev_get_drvdata(&mdiodev->dev);
> > +
> > +	if (!sw_dev)
> > +		return;
> > +
> > +	lan9303_shutdown(&sw_dev->chip);
> > +
> > +	dev_set_drvdata(&mdiodev->dev, NULL);
> >  }
> 
> This unfortunately didn't work well with LAN9303 and probably will not work
> with others:
> 
> Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
> CPU: 0 PID: 442 Comm: kworker/0:3 Tainted: G           O       6.1.99+gitb7793b7d9b35 #1
> Workqueue: events_power_efficient phy_state_machine
> pc : lan9303_mdio_phy_read+0x1c/0x34
> lr : lan9303_phy_read+0x50/0x100
> Call trace:
>  lan9303_mdio_phy_read+0x1c/0x34
>  lan9303_phy_read+0x50/0x100
>  dsa_slave_phy_read+0x40/0x50
>  __mdiobus_read+0x34/0x130
>  mdiobus_read+0x44/0x70
>  genphy_update_link+0x2c/0x104
>  genphy_read_status+0x2c/0x120
>  phy_check_link_status+0xb8/0xcc
>  phy_state_machine+0x198/0x27c
>  process_one_work+0x1dc/0x450
>  worker_thread+0x154/0x450
> 
> as long as the ports are not down (and dsa_switch_shutdown() doesn't ensure it),
> we cannot just zero drvdata, because PHY polling will eventually call
> 
> static int lan9303_mdio_phy_read(struct lan9303 *chip, int addr, int reg)
> {
>         struct lan9303_mdio *sw_dev = dev_get_drvdata(chip->dev);
> 
>         return mdiobus_read_nested(sw_dev->device->bus, addr, reg);
> 
> There are however multiple other unsafe patterns.
> I suppose current
> 
> dsa_switch_shutdown();
> dev_set_drvdata(...->dev, NULL);
> 
> pattern is broken in many cases...

Unfortunately the code portion which you've quoted for your reply does not
show the full story. dsa_switch_shutdown(), at the time of this patch,
was implemented like this (stripped of comments):

void dsa_switch_shutdown(struct dsa_switch *ds)
{
	struct net_device *master, *slave_dev;
	LIST_HEAD(unregister_list);
	struct dsa_port *dp;

	mutex_lock(&dsa2_mutex);
	rtnl_lock();

	list_for_each_entry(dp, &ds->dst->ports, list) {
		if (dp->ds != ds)
			continue;

		if (!dsa_port_is_user(dp))
			continue;

		master = dp->cpu_dp->master;
		slave_dev = dp->slave;

		netdev_upper_dev_unlink(master, slave_dev);
		unregister_netdevice_queue(slave_dev, &unregister_list);
	}
	unregister_netdevice_many(&unregister_list);

	rtnl_unlock();
	mutex_unlock(&dsa2_mutex);
}

I believe you would be wrong to blame this patch for exiting with the
slave/user ports still running (and thus ds->ops->phy_read() still
callable), because, as you can see, it doesn't do that - it unregisters
them, which also stops the net_device prior. So, both phylink_stop() and
phylink_destroy() would be called.

The patch had other problems though, and that led to the rework in
commit ee534378f005 ("net: dsa: fix panic when DSA master device unbinds
on shutdown"), rework which is in fact to blame for what you're reporting.

Given that we are talking about a fix to a fix, it doesn't really matter
in terms of backporting targets which one it is, but for correctness sake,
it is the later patch that fixed some things while introducing the race
condition.