[OpenWrt-Devel] Memory leak related to OpenWrt patch of hostapd

Nick Schaf nick.schaf at jci.com
Fri Aug 2 12:23:54 EDT 2019



> Nick Schaf <nick.schaf at jci.com> [2019-07-31 16:34:36]:
> 
> Hi,
> 
> > I've noticed the wpa_supplicant process on my mesh interfaces leaking
> > memory to the point that the kernel kills the process.  It was
> > discovered in 18.06.2, but I've reproduced it with 18.06.4 and with
> > the master branch from the GitHub repo.  Since the leak occurs as mesh
> > links are created and destroyed, I was able to reproduce it with a
> > simple two-node setup where I monitor the wpa_supplicant process VSZ
> > on one node and repeatedly bring wifi up and down on the other node.
> >
> > I've traced it back to the 18.06.2 release, specifically to lines
> > 34-35 of
> > package/network/services/hostapd/patches/015-mesh-do-not-use-offchan-mgmt-tx-on-DFS.patch:
> >
> > +                 (modes = nl80211_get_hw_feature_data(bss, &num_modes, &flags,
> > +                                                      &dfs_domain)) &&
> >
> > That code was added in
> > a35f24309021c1c0e9cbed0faedf58b941cb4bd3.
> >
> > I removed the entire patch file to resolve the memory leak because the
> > subsequent call to ieee80211_is_dfs() uses the return value from
> > nl80211_get_hw_feature_data().  However, I know the problem is
> > specifically related to the nl80211_get_hw_feature_data() call because
> > I stepped backward through commits of the hostapd source until I got
> > back to 0f7fc6b98de9c69f511b9b22f2b65553126362eb, where
> > ieee80211_is_dfs() had only one argument and didn't rely on the
> > nl80211_get_hw_feature_data() return value.  At that point, the memory
> > leak still occurred until I commented out the call to
> > nl80211_get_hw_feature_data().
> >
> > I attempted to dive into nl80211_get_hw_feature_data(), but was
> > quickly lost, so I defer to those that are more experienced in that code.
> 
> you did a nice job here to track it down, so thanks for reporting this, can you
> try this patch[1]?
> 

I had already tried adding an os_free(modes) and it did not resolve the leak.  To be sure, I tried your patch today and still observe the leak.  I also ran the original code for comparison, to determine whether the patch reduces the leak rate.  From that test (data below), it seems possible that the modes leak is only a small portion of the overall leak I observed.
I still suspect the main leak is somewhere inside nl80211_get_hw_feature_data().

For your reference, data from today's quick test is below.  VSZ is the "VmSize" value (in kB) from /proc/[PID]/status, where PID is the wpa_supplicant process ID.  "Unpatched" is the clean 18.06.4 code; "patched" is the same code with your patch applied.
The other node cycles the connection roughly every 30 seconds (while [ 1 ]; do wifi down; sleep 10; wifi; sleep 20; done).
VSZ does not rise on every 30-second cycle, which leads me to believe the leaked memory is allocated from a memory pool, and that the pool only needs to grow periodically as the leak continues.
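
A minimal sketch of how VSZ samples like the ones below could be collected, assuming a Linux /proc filesystem and a POSIX shell with awk (as BusyBox provides).  The function names vmsize_kb and log_vmsize are hypothetical helpers, not part of OpenWrt or hostapd, and this is not necessarily the exact method used to gather the data in this message.

```shell
#!/bin/sh
# Print the VmSize value (in kB) of a process, read from /proc/<pid>/status.
vmsize_kb() {
    awk '/^VmSize:/ { print $2 }' "/proc/$1/status"
}

# Print "elapsed_seconds,vsz_kb" lines for a PID at a fixed interval.
# $1 = PID, $2 = interval in seconds (default 10), $3 = sample count (default 30).
# Stops early if the process status file is no longer readable.
log_vmsize() {
    pid=$1
    interval=${2:-10}
    count=${3:-30}
    elapsed=0
    i=0
    while [ "$i" -lt "$count" ] && [ -r "/proc/$pid/status" ]; do
        echo "$elapsed,$(vmsize_kb "$pid")"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        elapsed=$((elapsed + interval))
    done
}
```

For example, log_vmsize "$(pidof wpa_supplicant)" 10 30 would print 30 samples at 10-second intervals.  If more than one wpa_supplicant process is running, pass a single PID explicitly.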

Time (s),VSZ unpatched (kB),VSZ patched (kB)
0,3408,3404
10,3408,3408
20,3408,3416
30,3408,3416
40,3408,3420
50,3408,3440
60,3408,3440
70,3412,3440
80,3432,3440
90,3432,3440
100,3432,3440
110,3432,3464
120,3432,3464
130,3432,3464
140,3432,3464
150,3432,3464
160,3436,3464
170,3456,3464
180,3456,3464
190,3456,3464
200,3456,3464
210,3456,3464
220,3460,3464
230,3480,3468
,,3468
,,3468
,,3472
,,3472
,,3472
,,3496
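
To make the step pattern in the table above easier to see, here is a sketch that prints only the rows where VSZ increased, along with the size of each increase.  vsz_steps is a hypothetical helper; it assumes the CSV above is supplied on standard input including its header line, and that a POSIX awk is available.

```shell
#!/bin/sh
# Print each increase in one CSV column as "time,delta_kb".
# $1 = column number (2 = unpatched, 3 = patched); the CSV is read from stdin.
vsz_steps() {
    awk -F, -v col="$1" '
        NR == 1    { next }   # skip the header line
        $col == "" { next }   # skip rows with no sample in this column
        have_prev && ($col + 0) > (prev + 0) {
            print $1 "," ($col - prev)
        }
        { prev = $col; have_prev = 1 }
    '
}
```

Run as vsz_steps 2 < data.csv (data.csv being a hypothetical file holding the table), the unpatched column shows increases only at 70, 80, 160, 170, 220 and 230 s, in a repeating +4 kB / +20 kB pattern, rather than one increase per 30-second cycle.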

_______________________________________________
openwrt-devel mailing list
openwrt-devel at lists.openwrt.org
https://lists.openwrt.org/mailman/listinfo/openwrt-devel


