[PATCH] ath11k: free peer for station when disconnect from AP for QCA6390/WCN6855

Kalle Valo kvalo at kernel.org
Tue Dec 14 07:37:40 PST 2021


Wen Gong <quic_wgong at quicinc.com> wrote:

> Commit b4a0f54156ac ("ath11k: move peer delete after vdev stop of station
> for QCA6390 and WCN6855") fixed a firmware crash by changing the WMI
> command sequence, but it actually skips the entire peer delete operation.
> As a result, commit 58595c9874c6 ("ath11k: Fixing dangling pointer issue
> upon peer delete failure") no longer takes effect, and KASAN reports a
> use-after-free because peer->sta is not set to NULL and is then used
> later.
> 
> Change this to skip only the WMI_PEER_DELETE_CMDID command for QCA6390/WCN6855.
> 
> Log of the use-after-free:
> 
> [  534.888665] BUG: KASAN: use-after-free in ath11k_dp_rx_update_peer_stats+0x912/0xc10 [ath11k]
> [  534.888696] Read of size 8 at addr ffff8881396bb1b8 by task rtcwake/2860
> 
> [  534.888705] CPU: 4 PID: 2860 Comm: rtcwake Kdump: loaded Tainted: G        W         5.15.0-wt-ath+ #523
> [  534.888712] Hardware name: Intel(R) Client Systems NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0067.2021.0528.1339 05/28/2021
> [  534.888716] Call Trace:
> [  534.888720]  <IRQ>
> [  534.888726]  dump_stack_lvl+0x57/0x7d
> [  534.888736]  print_address_description.constprop.0+0x1f/0x170
> [  534.888745]  ? ath11k_dp_rx_update_peer_stats+0x912/0xc10 [ath11k]
> [  534.888771]  kasan_report.cold+0x83/0xdf
> [  534.888783]  ? ath11k_dp_rx_update_peer_stats+0x912/0xc10 [ath11k]
> [  534.888810]  ath11k_dp_rx_update_peer_stats+0x912/0xc10 [ath11k]
> [  534.888840]  ath11k_dp_rx_process_mon_status+0x529/0xa70 [ath11k]
> [  534.888874]  ? ath11k_dp_rx_mon_status_bufs_replenish+0x3f0/0x3f0 [ath11k]
> [  534.888897]  ? check_prev_add+0x20f0/0x20f0
> [  534.888922]  ? __lock_acquire+0xb72/0x1870
> [  534.888937]  ? find_held_lock+0x33/0x110
> [  534.888954]  ath11k_dp_rx_process_mon_rings+0x297/0x520 [ath11k]
> [  534.888981]  ? rcu_read_unlock+0x40/0x40
> [  534.888990]  ? ath11k_dp_rx_pdev_alloc+0xd90/0xd90 [ath11k]
> [  534.889026]  ath11k_dp_service_mon_ring+0x67/0xe0 [ath11k]
> [  534.889053]  ? ath11k_dp_rx_process_mon_rings+0x520/0x520 [ath11k]
> [  534.889075]  call_timer_fn+0x167/0x4a0
> [  534.889084]  ? add_timer_on+0x3b0/0x3b0
> [  534.889103]  ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370
> [  534.889117]  __run_timers.part.0+0x539/0x8b0
> [  534.889123]  ? ath11k_dp_rx_process_mon_rings+0x520/0x520 [ath11k]
> [  534.889157]  ? call_timer_fn+0x4a0/0x4a0
> [  534.889164]  ? mark_lock_irq+0x1c30/0x1c30
> [  534.889173]  ? clockevents_program_event+0xdd/0x280
> [  534.889189]  ? mark_held_locks+0xa5/0xe0
> [  534.889203]  run_timer_softirq+0x97/0x180
> [  534.889213]  __do_softirq+0x276/0x86a
> [  534.889230]  __irq_exit_rcu+0x11c/0x180
> [  534.889238]  irq_exit_rcu+0x5/0x20
> [  534.889244]  sysvec_apic_timer_interrupt+0x8e/0xc0
> [  534.889251]  </IRQ>
> [  534.889254]  <TASK>
> [  534.889259]  asm_sysvec_apic_timer_interrupt+0x12/0x20
> [  534.889265] RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70
> [  534.889271] Code: 74 24 10 e8 ea c2 bf fd 48 89 ef e8 12 53 c0 fd 81 e3 00 02 00 00 75 25 9c 58 f6 c4 02 75 2d 48 85 db 74 01 fb bf 01 00 00 00 <e8> 13 a7 b5 fd 65 8b 05 cc d9 9c 5e 85 c0 74 0a 5b 5d c3 e8 a0 ee
> [  534.889276] RSP: 0018:ffffc90002e5f880 EFLAGS: 00000206
> [  534.889284] RAX: 0000000000000006 RBX: 0000000000000200 RCX: ffffffff9f256f10
> [  534.889289] RDX: 0000000000000000 RSI: ffffffffa1c6e420 RDI: 0000000000000001
> [  534.889293] RBP: ffff8881095e6200 R08: 0000000000000001 R09: ffffffffa40d2b8f
> [  534.889298] R10: fffffbfff481a571 R11: 0000000000000001 R12: ffff8881095e6e68
> [  534.889302] R13: ffffc90002e5f908 R14: 0000000000000246 R15: 0000000000000000
> [  534.889316]  ? mark_lock+0xd0/0x14a0
> [  534.889332]  klist_next+0x1d4/0x450
> [  534.889340]  ? dpm_wait_for_subordinate+0x2d0/0x2d0
> [  534.889350]  device_for_each_child+0xa8/0x140
> [  534.889360]  ? device_remove_class_symlinks+0x1b0/0x1b0
> [  534.889370]  ? __lock_release+0x4bd/0x9f0
> [  534.889378]  ? dpm_suspend+0x26b/0x3f0
> [  534.889390]  dpm_wait_for_subordinate+0x82/0x2d0
> [  534.889400]  ? dpm_for_each_dev+0xa0/0xa0
> [  534.889410]  ? dpm_suspend+0x233/0x3f0
> [  534.889427]  __device_suspend+0xd4/0x10c0
> [  534.889440]  ? wait_for_completion_io+0x270/0x270
> [  534.889456]  ? async_suspend_late+0xe0/0xe0
> [  534.889463]  ? async_schedule_node_domain+0x468/0x640
> [  534.889482]  dpm_suspend+0x25a/0x3f0
> [  534.889491]  ? dpm_suspend_end+0x1a0/0x1a0
> [  534.889497]  ? ktime_get+0x214/0x2f0
> [  534.889502]  ? lockdep_hardirqs_on+0x79/0x100
> [  534.889509]  ? recalibrate_cpu_khz+0x10/0x10
> [  534.889516]  ? ktime_get+0x119/0x2f0
> [  534.889528]  dpm_suspend_start+0xab/0xc0
> [  534.889538]  suspend_devices_and_enter+0x1ca/0x350
> [  534.889546]  ? suspend_enter+0x850/0x850
> [  534.889566]  enter_state+0x27c/0x3d7
> [  534.889575]  pm_suspend.cold+0x42/0x189
> [  534.889583]  state_store+0xab/0x160
> [  534.889595]  ? sysfs_file_ops+0x160/0x160
> [  534.889601]  kernfs_fop_write_iter+0x2b5/0x450
> [  534.889615]  new_sync_write+0x36a/0x600
> [  534.889625]  ? new_sync_read+0x600/0x600
> [  534.889639]  ? rcu_read_unlock+0x40/0x40
> [  534.889668]  vfs_write+0x619/0x910
> [  534.889681]  ksys_write+0xf4/0x1d0
> [  534.889689]  ? __ia32_sys_read+0xa0/0xa0
> [  534.889699]  ? lockdep_hardirqs_on_prepare.part.0+0x18c/0x370
> [  534.889707]  ? syscall_enter_from_user_mode+0x1d/0x50
> [  534.889719]  do_syscall_64+0x3b/0x90
> [  534.889725]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> [  534.889731] RIP: 0033:0x7f0b9bc931e7
> [  534.889736] Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
> [  534.889741] RSP: 002b:00007ffd9d34cc88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [  534.889749] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f0b9bc931e7
> [  534.889753] RDX: 0000000000000004 RSI: 0000561cd023c5f0 RDI: 0000000000000004
> [  534.889757] RBP: 0000561cd023c5f0 R08: 0000000000000000 R09: 0000000000000004
> [  534.889761] R10: 0000561ccef842a6 R11: 0000000000000246 R12: 0000000000000004
> [  534.889765] R13: 0000561cd0239590 R14: 00007f0b9bd6f4a0 R15: 00007f0b9bd6e8a0
> [  534.889789]  </TASK>
> 
> [  534.889796] Allocated by task 2711:
> [  534.889800]  kasan_save_stack+0x1b/0x40
> [  534.889805]  __kasan_kmalloc+0x7c/0x90
> [  534.889810]  sta_info_alloc+0x98/0x1ef0 [mac80211]
> [  534.889874]  ieee80211_prep_connection+0x30b/0x11e0 [mac80211]
> [  534.889950]  ieee80211_mgd_auth+0x529/0xe00 [mac80211]
> [  534.890024]  cfg80211_mlme_auth+0x332/0x6f0 [cfg80211]
> [  534.890090]  nl80211_authenticate+0x839/0xcf0 [cfg80211]
> [  534.890147]  genl_family_rcv_msg_doit+0x1f4/0x2f0
> [  534.890154]  genl_rcv_msg+0x280/0x500
> [  534.890160]  netlink_rcv_skb+0x11c/0x340
> [  534.890165]  genl_rcv+0x1f/0x30
> [  534.890170]  netlink_unicast+0x42b/0x700
> [  534.890176]  netlink_sendmsg+0x71b/0xc60
> [  534.890181]  sock_sendmsg+0xdf/0x110
> [  534.890187]  ____sys_sendmsg+0x5c0/0x850
> [  534.890192]  ___sys_sendmsg+0xe4/0x160
> [  534.890197]  __sys_sendmsg+0xb2/0x140
> [  534.890202]  do_syscall_64+0x3b/0x90
> [  534.890207]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> [  534.890215] Freed by task 2825:
> [  534.890218]  kasan_save_stack+0x1b/0x40
> [  534.890223]  kasan_set_track+0x1c/0x30
> [  534.890227]  kasan_set_free_info+0x20/0x30
> [  534.890232]  __kasan_slab_free+0xce/0x100
> [  534.890237]  slab_free_freelist_hook+0xf0/0x1a0
> [  534.890242]  kfree+0xe5/0x370
> [  534.890248]  __sta_info_flush+0x333/0x4b0 [mac80211]
> [  534.890308]  ieee80211_set_disassoc+0x324/0xd20 [mac80211]
> [  534.890382]  ieee80211_mgd_deauth+0x537/0xee0 [mac80211]
> [  534.890472]  cfg80211_mlme_deauth+0x349/0x810 [cfg80211]
> [  534.890526]  cfg80211_mlme_down+0x1ce/0x270 [cfg80211]
> [  534.890578]  cfg80211_disconnect+0x4f5/0x7b0 [cfg80211]
> [  534.890631]  cfg80211_leave+0x24/0x40 [cfg80211]
> [  534.890677]  wiphy_suspend+0x23d/0x2f0 [cfg80211]
> [  534.890723]  dpm_run_callback+0xf4/0x1b0
> [  534.890728]  __device_suspend+0x648/0x10c0
> [  534.890733]  async_suspend+0x16/0xe0
> [  534.890737]  async_run_entry_fn+0x90/0x4f0
> [  534.890741]  process_one_work+0x866/0x1490
> [  534.890747]  worker_thread+0x596/0x1010
> [  534.890751]  kthread+0x35d/0x420
> [  534.890756]  ret_from_fork+0x22/0x30
> 
> [  534.890763] The buggy address belongs to the object at ffff8881396ba000
>                 which belongs to the cache kmalloc-8k of size 8192
> [  534.890767] The buggy address is located 4536 bytes inside of
>                 8192-byte region [ffff8881396ba000, ffff8881396bc000)
> [  534.890772] The buggy address belongs to the page:
> [  534.890775] page:ffffea0004e5ae00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1396b8
> [  534.890780] head:ffffea0004e5ae00 order:3 compound_mapcount:0 compound_pincount:0
> [  534.890784] flags: 0x200000000010200(slab|head|node=0|zone=2)
> [  534.890791] raw: 0200000000010200 ffffea000562be08 ffffea0004b04c08 ffff88810004e340
> [  534.890795] raw: 0000000000000000 0000000000010001 00000001ffffffff 0000000000000000
> [  534.890798] page dumped because: kasan: bad access detected
> 
> [  534.890804] Memory state around the buggy address:
> [  534.890807]  ffff8881396bb080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  534.890811]  ffff8881396bb100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  534.890814] >ffff8881396bb180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  534.890817]                                         ^
> [  534.890821]  ffff8881396bb200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  534.890824]  ffff8881396bb280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [  534.890827] ==================================================================
> [  534.890830] Disabling lock debugging due to kernel taint
> 
> Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-01720.1-QCAHSPSWPL_V1_V2_SILICONZ_LITE-1
> 
> Fixes: b4a0f54156ac ("ath11k: move peer delete after vdev stop of station for QCA6390 and WCN6855")
> Signed-off-by: Wen Gong <quic_wgong at quicinc.com>
> Signed-off-by: Kalle Valo <quic_kvalo at quicinc.com>

This patch didn't compile, but that was easy to fix:

drivers/net/wireless/ath/ath11k/mac.c: In function 'ath11k_mac_op_sta_state':
drivers/net/wireless/ath/ath11k/mac.c:4443:1: error: label 'free' defined but not used [-Werror=unused-label]

But the bigger problem is that this patch causes new warnings on both QCA6390 and WCN6855:

[  123.051029] ath11k_pci 0000:06:00.0: Found peer entry c0:97:27:8d:ec:99 n vdev 0 after it was supposedly removed
[  123.097662] ath11k_pci 0000:06:00.0: peer-unmap-event: unknown peer id 2
[  144.960346] ath11k_pci 0000:06:00.0: Found peer entry c0:97:27:8d:ec:99 n vdev 0 after it was supposedly removed
[  145.020544] ath11k_pci 0000:06:00.0: peer-unmap-event: unknown peer id 2
[  161.250859] ath11k_pci 0000:06:00.0: Found peer entry c0:97:27:8d:ec:99 n vdev 0 after it was supposedly removed
[  161.329396] ath11k_pci 0000:06:00.0: peer-unmap-event: unknown peer id 5
[  177.333645] ath11k_pci 0000:06:00.0: Found peer entry c0:97:27:8d:ec:99 n vdev 0 after it was supposedly removed
[  177.390962] ath11k_pci 0000:06:00.0: peer-unmap-event: unknown peer id 8
[  196.690035] ath11k_pci 0000:06:00.0: Found peer entry c0:97:27:8d:ec:99 n vdev 0 after it was supposedly removed
[  196.747167] ath11k_pci 0000:06:00.0: peer-unmap-event: unknown peer id 11

Patch set to Changes Requested.
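
For readers following the quoted commit message, the difference between
skipping the whole peer teardown and skipping only WMI_PEER_DELETE_CMDID can be
sketched in plain C. This is a simplified userspace model, not the driver code:
every sketch_* name, the fw_delete_sent flag and the delete_after_vdev_stop
parameter are hypothetical stand-ins, and the real locking and firmware
handshake are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real ath11k/mac80211 structures. */
struct sketch_sta {
	int aid;
};

struct sketch_peer {
	struct sketch_sta *sta;	/* borrowed pointer; the upper layer frees the station */
	bool fw_delete_sent;	/* models "the peer delete command went to firmware" */
};

/*
 * Behaviour the commit message attributes to b4a0f54156ac: on hardware
 * that deletes the peer after vdev stop, the whole teardown is skipped.
 */
void sketch_disconnect_skip_all(struct sketch_peer *peer,
				bool delete_after_vdev_stop)
{
	if (delete_after_vdev_stop)
		return;		/* peer->sta stays set and dangles once sta is freed */

	peer->fw_delete_sent = true;
	peer->sta = NULL;
}

/*
 * Behaviour the commit message proposes: skip only the firmware command,
 * but always drop the host-side reference to the station.
 */
void sketch_disconnect_skip_cmd_only(struct sketch_peer *peer,
				     bool delete_after_vdev_stop)
{
	if (!delete_after_vdev_stop)
		peer->fw_delete_sent = true;

	peer->sta = NULL;
}

/* Models a later RX statistics path that dereferences peer->sta when set. */
bool sketch_rx_stats_would_touch_sta(const struct sketch_peer *peer)
{
	return peer->sta != NULL;
}
```

In this model, the first variant leaves peer->sta pointing at a station the
upper layer is about to free, so a later statistics update would read freed
memory, which matches the shape of the KASAN report. The second variant clears
the pointer regardless of whether the firmware command is sent. The sketch does
not model the "Found peer entry" and "peer-unmap-event" warnings listed above.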

-- 
https://patchwork.kernel.org/project/linux-wireless/patch/20211214024108.10397-1-quic_wgong@quicinc.com/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches



