[openwrt/openwrt] mac80211: backport tx queueing bugfixes add a bug fix for a rare crash

LEDE Commits lede-commits at lists.infradead.org
Thu Sep 15 09:17:26 PDT 2022


nbd pushed a commit to openwrt/openwrt.git, branch openwrt-22.03:
https://git.openwrt.org/b3fa0241e2faaf4080f4903c0547949ebcddbc6b

commit b3fa0241e2faaf4080f4903c0547949ebcddbc6b
Author: Felix Fietkau <nbd at nbd.name>
AuthorDate: Sun Mar 27 00:05:49 2022 +0100

    mac80211: backport tx queueing bugfixes add a bug fix for a rare crash
    
    Re-introduce the queue wake fix that was reverted due to a regression,
    but this time with the follow-up fixes that take care of the regression.
    
    Signed-off-by: Felix Fietkau <nbd at nbd.name>
    (cherry picked from commit 9a93b62f315ad4c9f021c414ed80ba337ab4a01e)
    (cherry-picked from commit 8b804cae5e039142bc63896a75f15146eca3bebc)
    (cherry-picked from commit 8b06e06832ebe757246582b65306ad2a2537741f)
---
 ...-not-wake-queues-on-a-vif-that-is-being-s.patch | 38 +++++++++++++++++
 ...11-do-not-abuse-fq.lock-in-ieee80211_do_s.patch | 46 +++++++++++++++++++++
 ...x-deadlock-Don-t-start-TX-while-holding-f.patch | 40 ++++++++++++++++++
 ...sure-vif-queues-are-operational-after-sta.patch | 47 ++++++++++++++++++++++
 4 files changed, 171 insertions(+)

diff --git a/package/kernel/mac80211/patches/subsys/328-mac80211-do-not-wake-queues-on-a-vif-that-is-being-s.patch b/package/kernel/mac80211/patches/subsys/328-mac80211-do-not-wake-queues-on-a-vif-that-is-being-s.patch
new file mode 100644
index 0000000000..f0150ddef0
--- /dev/null
+++ b/package/kernel/mac80211/patches/subsys/328-mac80211-do-not-wake-queues-on-a-vif-that-is-being-s.patch
@@ -0,0 +1,38 @@
+From: Felix Fietkau <nbd at nbd.name>
+Date: Sat, 26 Mar 2022 23:58:35 +0100
+Subject: [PATCH] mac80211: do not wake queues on a vif that is being stopped
+
+When a vif is being removed and sdata->bss is cleared, __ieee80211_wake_txqs
+can still be called on it, which crashes as soon as sdata->bss is being
+dereferenced.
+To fix this properly, check for SDATA_STATE_RUNNING before waking queues,
+and take the fq lock when setting it (to ensure that __ieee80211_wake_txqs
+observes the change when running on a different CPU
+
+Signed-off-by: Felix Fietkau <nbd at nbd.name>
+---
+
+--- a/net/mac80211/iface.c
++++ b/net/mac80211/iface.c
+@@ -377,7 +377,9 @@ static void ieee80211_do_stop(struct iee
+ 	bool cancel_scan;
+ 	struct cfg80211_nan_func *func;
+ 
++	spin_lock_bh(&local->fq.lock);
+ 	clear_bit(SDATA_STATE_RUNNING, &sdata->state);
++	spin_unlock_bh(&local->fq.lock);
+ 
+ 	cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata;
+ 	if (cancel_scan)
+--- a/net/mac80211/util.c
++++ b/net/mac80211/util.c
+@@ -301,6 +301,9 @@ static void __ieee80211_wake_txqs(struct
+ 	local_bh_disable();
+ 	spin_lock(&fq->lock);
+ 
++	if (!test_bit(SDATA_STATE_RUNNING, &sdata->state))
++		goto out;
++
+ 	if (sdata->vif.type == NL80211_IFTYPE_AP)
+ 		ps = &sdata->bss->ps;
+ 
diff --git a/package/kernel/mac80211/patches/subsys/340-wifi-mac80211-do-not-abuse-fq.lock-in-ieee80211_do_s.patch b/package/kernel/mac80211/patches/subsys/340-wifi-mac80211-do-not-abuse-fq.lock-in-ieee80211_do_s.patch
new file mode 100644
index 0000000000..82243f1d98
--- /dev/null
+++ b/package/kernel/mac80211/patches/subsys/340-wifi-mac80211-do-not-abuse-fq.lock-in-ieee80211_do_s.patch
@@ -0,0 +1,46 @@
+From aa40d5a43526cca9439a2b45fcfdcd016594dece Mon Sep 17 00:00:00 2001
+From: Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp>
+Date: Sun, 17 Jul 2022 21:21:52 +0900
+Subject: [PATCH] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
+MIME-Version: 1.0
+Content-Type: text/plain; charset=UTF-8
+Content-Transfer-Encoding: 8bit
+
+lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
+for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif
+that is being stopped") guards clear_bit() using fq.lock even before
+fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.
+
+According to discussion [2], Toke was not happy with expanding usage of
+fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
+can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().
+
+Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
+Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2]
+Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6 at syzkaller.appspotmail.com>
+Signed-off-by: Tetsuo Handa <penguin-kernel at I-love.SAKURA.ne.jp>
+Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
+Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6 at syzkaller.appspotmail.com>
+Acked-by: Toke Høiland-Jørgensen <toke at kernel.org>
+Signed-off-by: Kalle Valo <kvalo at kernel.org>
+Link: https://lore.kernel.org/r/9cc9b81d-75a3-3925-b612-9d0ad3cab82b@I-love.SAKURA.ne.jp
+[ pick up commit 3598cb6e1862 ("wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()") from -next]
+Link: https://lore.kernel.org/all/87o7xcq6qt.fsf@kernel.org/
+Signed-off-by: Jakub Kicinski <kuba at kernel.org>
+---
+ net/mac80211/iface.c | 3 +--
+ 1 file changed, 1 insertion(+), 2 deletions(-)
+
+--- a/net/mac80211/iface.c
++++ b/net/mac80211/iface.c
+@@ -377,9 +377,8 @@ static void ieee80211_do_stop(struct iee
+ 	bool cancel_scan;
+ 	struct cfg80211_nan_func *func;
+ 
+-	spin_lock_bh(&local->fq.lock);
+ 	clear_bit(SDATA_STATE_RUNNING, &sdata->state);
+-	spin_unlock_bh(&local->fq.lock);
++	synchronize_rcu(); /* flush _ieee80211_wake_txqs() */
+ 
+ 	cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata;
+ 	if (cancel_scan)
diff --git a/package/kernel/mac80211/patches/subsys/341-mac80211-Fix-deadlock-Don-t-start-TX-while-holding-f.patch b/package/kernel/mac80211/patches/subsys/341-mac80211-Fix-deadlock-Don-t-start-TX-while-holding-f.patch
new file mode 100644
index 0000000000..8c56acbf88
--- /dev/null
+++ b/package/kernel/mac80211/patches/subsys/341-mac80211-Fix-deadlock-Don-t-start-TX-while-holding-f.patch
@@ -0,0 +1,40 @@
+From: Alexander Wetzel <alexander at wetzel-home.de>
+Date: Thu, 15 Sep 2022 14:41:20 +0200
+Subject: [PATCH] mac80211: Fix deadlock: Don't start TX while holding
+ fq->lock
+
+ieee80211_txq_purge() calls fq_tin_reset() and
+ieee80211_purge_tx_queue(); Both are then calling
+ieee80211_free_txskb(). Which can decide to TX the skb again.
+
+There are at least two ways to get a deadlock:
+
+1) When we have a TDLS teardown packet queued in either tin or frags
+   ieee80211_tdls_td_tx_handle() will call ieee80211_subif_start_xmit()
+   while we still hold fq->lock. ieee80211_txq_enqueue() will thus
+   deadlock.
+
+2) A variant of the above happens if aggregation is up and running:
+   In that case ieee80211_iface_work() will deadlock with the original
+   task: The original tasks already holds fq->lock and tries to get
+   sta->lock after kicking off ieee80211_iface_work(). But the worker
+   can get sta->lock prior to the original task and will then spin for
+   fq->lock.
+
+Avoid these deadlocks by not sending out any skbs when called via
+ieee80211_free_txskb().
+
+Signed-off-by: Alexander Wetzel <alexander at wetzel-home.de>
+---
+
+--- a/net/mac80211/status.c
++++ b/net/mac80211/status.c
+@@ -698,7 +698,7 @@ static void ieee80211_report_used_skb(st
+ 
+ 		if (!sdata) {
+ 			skb->dev = NULL;
+-		} else {
++		} else if (!dropped) {
+ 			unsigned int hdr_size =
+ 				ieee80211_hdrlen(hdr->frame_control);
+ 
diff --git a/package/kernel/mac80211/patches/subsys/342-mac80211-Ensure-vif-queues-are-operational-after-sta.patch b/package/kernel/mac80211/patches/subsys/342-mac80211-Ensure-vif-queues-are-operational-after-sta.patch
new file mode 100644
index 0000000000..4310329319
--- /dev/null
+++ b/package/kernel/mac80211/patches/subsys/342-mac80211-Ensure-vif-queues-are-operational-after-sta.patch
@@ -0,0 +1,47 @@
+From: Alexander Wetzel <alexander at wetzel-home.de>
+Date: Thu, 15 Sep 2022 15:09:46 +0200
+Subject: [PATCH] mac80211: Ensure vif queues are operational after start
+
+Make sure local->queue_stop_reasons and vif.txqs_stopped stay in sync.
+
+When a new vif is created the queues may end up in an inconsistent state
+and be inoperable:
+Communication not using iTXQ will work, allowing to e.g. complete the
+association. But the 4-way handshake will time out. The sta will not
+send out any skbs queued in iTXQs.
+
+All normal attempts to start the queues will fail when reaching this
+state.
+local->queue_stop_reasons will have marked all queues as operational but
+vif.txqs_stopped will still be set, creating an inconsistent internal
+state.
+
+In reality this seems to be race between the mac80211 function
+ieee80211_do_open() setting SDATA_STATE_RUNNING and the wake_txqs_tasklet:
+Depending on the driver and the timing the queues may end up to be
+operational or not.
+
+Cc: stable at vger.kernel.org
+Fixes: f856373e2f31 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
+Signed-off-by: Alexander Wetzel <alexander at wetzel-home.de>
+---
+
+--- a/net/mac80211/util.c
++++ b/net/mac80211/util.c
+@@ -301,14 +301,14 @@ static void __ieee80211_wake_txqs(struct
+ 	local_bh_disable();
+ 	spin_lock(&fq->lock);
+ 
++	sdata->vif.txqs_stopped[ac] = false;
++
+ 	if (!test_bit(SDATA_STATE_RUNNING, &sdata->state))
+ 		goto out;
+ 
+ 	if (sdata->vif.type == NL80211_IFTYPE_AP)
+ 		ps = &sdata->bss->ps;
+ 
+-	sdata->vif.txqs_stopped[ac] = false;
+-
+ 	list_for_each_entry_rcu(sta, &local->sta_list, list) {
+ 		if (sdata != sta->sdata)
+ 			continue;




More information about the lede-commits mailing list