[PATCH v2] nl80211: wait on udev when creating new device
Benjamin Berg
benjamin at sipsolutions.net
Tue Sep 30 05:50:34 PDT 2025
On Tue, 2025-09-30 at 11:36 +0300, Jouni Malinen wrote:
> On Thu, Jun 05, 2025 at 09:22:11AM +0200, Benjamin Berg wrote:
> > udev/systemd will process new network device. This can result in various
> > issues as for example the MAC address may be randomized. Add the
> > appropriate integration to wait for the udev "add" event before
> > continuing to use the device.
> >
> > This resolves race conditions when reading the MAC address during
> > interface creation (or even changing it right afterwards when creating a
> > P2P device).
> >
> > Enable this feature by default. Systems that do not use udev need to
> > explicitly disable it at compile time.
>
> This feels quite painful.. For example, this broke the build for me due
> to no libudev.h being installed on the system. Why would this need to be
> hardcoded to be enabled for all builds that include driver_nl80211.c?
It doesn't need to. But the option is easy to disable, and having it
default to "enabled" means that it is harder to accidentally forget it.
Said differently, I fear distributions will simply forget to enable the
udev integration and we then continue to have users with race
conditions during P2P setup.
>
> > See https://github.com/systemd/systemd/issues/13642
>
> How easy it is to hit this issue in practice? That case mentions hwsim..
> Does this happen easily outside testing environment?
Unfortunately, it has been a very long time. It looks like I originally
ran into the problem in the NetworkManager P2P tests (mail to the
hostap list in September 2019). I do not remember whether I tried to
reproduce it with real hardware.
It is a race condition. I am not sure how many distributions even
enable MAC randomization and I also do not know how likely it is to hit
the issue in that case. It may be that there are only a few people
actually affected or that affected users succeed often enough that they
simply try again.
This has been in my personal backlog from back in 2019 when I was
working on gnome-network-displays (Miracast implementation for GNOME).
I think I still have one or two of the Miracast dongles so I could
possibly check against real HW whether the issue is reproducible. That
said, I do feel that a hwsim reproduction is good enough to prove that
we have a udev/systemd integration problem.
> > diff --git a/src/drivers/driver_nl80211.c b/src/drivers/driver_nl80211.c
>
> > @@ -6239,6 +6242,43 @@ static int nl80211_create_iface_once(struct wpa_driver_nl80211_data *drv,
> > +#ifndef CONFIG_DRIVER_NL80211_DISABLE_UDEV
> > + /*
> > + * systemd/udev insist on processing new interfaces and may
> > + * randomize the MAC address. We need to avoid race conditions between
> > + * hostap reading the MAC address and systemd/udev changing it.
> > + * Setup a monitor and wait for an event for a "wlan" "net" device
> > + * with the expected IFINDEX.
> > + * We are guaranteed to receive an event because we install the monitor
> > + * before creating it.
> > + */
>
> This feels like something that should be done only if it can be
> determined that this issue is present in the system and in particular,
> not repeat this for every created interface.
We cannot discover that. Waiting for the udev add event is the correct
way to ensure that the device has been configured. I believe this is a
fundamental design decision in udev and everyone needs to adhere to
that[1].
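To illustrate what "waiting for the add event" means in terms of matching,
here is a minimal sketch. It is not the code from the patch; the helper
name is hypothetical. It assumes the two strings come from
udev_device_get_action() and from udev_device_get_property_value() for the
"IFINDEX" property, and that the expected ifindex is the one obtained when
creating the interface.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): decide whether a udev event is
 * the "add" event for the interface that was just created.
 *
 * action:      value from udev_device_get_action(), may be NULL
 * ifindex_str: value of the "IFINDEX" property, may be NULL
 * expected:    ifindex of the newly created interface
 *
 * Returns 1 on a match, 0 otherwise.
 */
static int is_expected_add_event(const char *action, const char *ifindex_str,
				 int expected)
{
	char *end;
	long val;

	/* Only the "add" event signals that udev finished initial setup */
	if (!action || strcmp(action, "add") != 0)
		return 0;
	if (!ifindex_str || !*ifindex_str)
		return 0;

	/* Reject values with trailing garbage instead of guessing */
	val = strtol(ifindex_str, &end, 10);
	if (*end != '\0')
		return 0;

	return val == expected;
}
```

Events for other interfaces, or "change"/"move" events for the same
interface, would simply be skipped and the wait would continue.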
> > + udev = udev_new();
> > + if (udev) {
> > + monitor = udev_monitor_new_from_netlink(udev, "udev");
> > + if (!monitor)
> > + wpa_printf(MSG_ERROR, "nl80211: Failed to create udev monitor");
> > + } else {
> > + wpa_printf(MSG_ERROR, "nl80211: Failed to connect to udev");
> > + }
>
> Are those really errors on systems that do not use udev?
These errors should never happen, even if the system does not use udev.
The kernel lets us connect to the netlink socket even if udev is not
running.
> > @@ -6282,6 +6322,46 @@ static int nl80211_create_iface_once(struct wpa_driver_nl80211_data *drv,
> > +#ifndef CONFIG_DRIVER_NL80211_DISABLE_UDEV
> > + if (monitor) {
> > + /* Set blocking mode on the FD */
> > + int fd = udev_monitor_get_fd(monitor);
> > + int flags = fcntl(fd, F_GETFL);
> > +
> > + fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
> > +
> > + while (1) {
>
> This is something I would rather not see unless it can be shown that the
> system running this is indeed going to have this issue.
I would prefer not having to do this, but I do not think we have a
choice here. And it would also be nicer if the wait was non-blocking.
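As a rough sketch of one intermediate step (not part of the patch, helper
name hypothetical): instead of switching the FD to blocking mode, the wait
could at least be bounded with poll() and a timeout. This still blocks the
caller for up to the timeout; a truly non-blocking variant would have to
hand the FD to the event loop instead.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper: wait until fd becomes readable or timeout_ms
 * elapses. Returns 1 if readable, 0 on timeout, -1 on error.
 *
 * Note: the full timeout is restarted if poll() is interrupted by a
 * signal, so the upper bound is approximate.
 */
static int wait_readable(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int res;

	do {
		res = poll(&pfd, 1, timeout_ms);
	} while (res < 0 && errno == EINTR);

	if (res < 0)
		return -1;
	if (res == 0)
		return 0; /* timed out without any event */

	/* Treat error/hangup without readable data as a failure */
	return (pfd.revents & POLLIN) ? 1 : -1;
}
```

The caller would pass the FD from udev_monitor_get_fd() and only try to
receive a device once this returns 1, giving up after some overall
deadline rather than waiting forever.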
Benjamin
[1] wpa_supplicant is not the only one getting this wrong. I also still
have a libusb bug open where it is not correctly waiting on udev. That
then causes integration issues with selinux and can break fprintd.
Unfortunately, we could not yet merge the fix because other software
uses libusb incorrectly and breaks …