Problems when using hostapd on meshnode

Kai Scharwies kai
Tue Jan 3 08:23:03 PST 2012


Hi Javier.

I applied your mac80211_Use_the_right_headroom_size_for_mesh_mgmt_frames.patch,
but the problem seems to persist.
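
For reference, the size of the overrun can be read off the skb_over_panic line in the crash log from the original report (quoted further down). The following is only a rough Python sketch of that arithmetic; the function name decode_skb_panic is made up for illustration, and it assumes that head/data/tail/end are hexadecimal addresses and that len/put are decimal byte counts, as they appear in that log line.

```python
import re

# Line copied from the kernel log in the original report, rejoined onto one line.
PANIC = ("skb_over_panic: text:d0cc78f3 len:43 put:17 head:cfdcb400 "
         "data:cfdcb42e tail:0xcfdcb459 end:0xcfdcb450 dev:<NULL>")

def decode_skb_panic(line):
    """Derive byte counts from the pointer fields of a skb_over_panic line."""
    fields = dict(re.findall(r"(\w+):(\S+)", line))
    # int(..., 16) accepts the addresses with or without a leading 0x.
    head, data, tail, end = (int(fields[k], 16)
                             for k in ("head", "data", "tail", "end"))
    put = int(fields["put"])  # bytes requested by the failing skb_put()
    return {
        "headroom": data - head,                    # reserved in front of the data
        "buffer_size": end - head,                  # total linear buffer
        "len_after_put": tail - data,               # should equal the len: field
        "tailroom_before_put": end - (tail - put),  # space left before the put
        "overrun": tail - end,                      # bytes past the end of the buffer
    }

print(decode_skb_panic(PANIC))
```

Under those assumptions the frame had 46 bytes of headroom in an 80-byte buffer, and the 17-byte skb_put() in mesh_path_error_tx found only 8 bytes of tailroom, so it ran 9 bytes past the end.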

When starting hostapd on one (virtual) interface, e.g. ap0, and then
issuing mesh join on another (with or without authsae), the plink
cycles between LISTEN, OPN_SNT and HOLDING with ever-changing mesh
llids.
Reversing the order (let the mesh plink establish, then start hostapd)
seems to work, although mesh data transfer is cut off for several
seconds when hostapd is started, and sometimes "ath: Failed to stop TX
DMA, queues=0x001!" is shown in the log.
Ping and iperf work for a while, but eventually all network interfaces
are dead. The machine still runs and the interfaces are up, but even a
wired Ethernet connection won't work until the next reboot.
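
To watch the plink cycling described above without reading the station dump by eye, something like the following Python sketch could be used. It assumes the "mesh llid:" and "mesh plink:" labels that iw prints in its station dump; the function name parse_plinks and the SAMPLE text (including the MAC address and llid) are invented for illustration and are not captured output.

```python
import re

# Invented sample in the general shape of "iw dev mesh0 station dump" output.
SAMPLE = """Station 00:11:22:33:44:55 (on mesh0)
\tmesh llid:\t4711
\tmesh plid:\t0
\tmesh plink:\tOPN_SNT
"""

def parse_plinks(dump):
    """Map each station MAC to its local link id and plink state."""
    peers = {}
    mac = None
    for line in dump.splitlines():
        line = line.strip()
        m = re.match(r"Station (\S+)", line)
        if m:
            # A new station block starts here.
            mac = m.group(1)
            peers[mac] = {"llid": None, "plink": None}
        elif mac and line.startswith("mesh llid:"):
            peers[mac]["llid"] = int(line.split(":", 1)[1])
        elif mac and line.startswith("mesh plink:"):
            peers[mac]["plink"] = line.split(":", 1)[1].strip()
    return peers

for mac, info in parse_plinks(SAMPLE).items():
    if info["plink"] != "ESTAB":
        print(mac, "not established:", info)
```

Feeding it the real dump text once per second and logging each (llid, plink) pair would show whether the llid changes on every pass through LISTEN, OPN_SNT and HOLDING.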

At this point a look into /sys/kernel/debug/ieee80211/phy0/ath9k/misc reveals:
...
OP-Mode: MESH(7)
Beacon-Timer-Register: 0x0
Timer-Mode-Register: 0x100000 ()
imask: 0xf4011071 (SWBA CST RX RXHP )
VIF Counts: AP: 0 STA: 0 MESH: 1 WDS: 0 ADHOC: 0 OTHER: 0 nvifs: 1
beacon-vifs: 1
...

while initially it was:
...
OP-Mode: AP(3)
Beacon-Timer-Register: 0x0
Timer-Mode-Register: 0x100000 ()
imask: 0xf4011071 (SWBA CST RX RXHP )
VIF Counts: AP: 1 STA: 0 MESH: 1 WDS: 0 ADHOC: 0 OTHER: 0 nvifs: 2
beacon-vifs: 2
...
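
The difference between the two snapshots is easier to see when the "VIF Counts" lines are compared field by field. A small Python sketch of that comparison follows; the helper names vif_counts and changed are hypothetical, and the two input lines are copied from the output above.

```python
import re

# "VIF Counts" lines copied from the two debugfs snapshots above.
INITIAL = "VIF Counts: AP: 1 STA: 0 MESH: 1 WDS: 0 ADHOC: 0 OTHER: 0 nvifs: 2"
AFTER_HANG = "VIF Counts: AP: 0 STA: 0 MESH: 1 WDS: 0 ADHOC: 0 OTHER: 0 nvifs: 1"

def vif_counts(line):
    """Turn a 'VIF Counts:' line into a dict of counter name -> value."""
    body = line.split("VIF Counts:", 1)[1]
    return {name: int(value)
            for name, value in re.findall(r"(\w+):\s*(\d+)", body)}

def changed(before, after):
    """Return only the counters that differ, as (before, after) pairs."""
    b, a = vif_counts(before), vif_counts(after)
    return {name: (b[name], a.get(name)) for name in b if b[name] != a.get(name)}

print(changed(INITIAL, AFTER_HANG))
```

For these two lines only AP (1 to 0) and nvifs (2 to 1) differ, which matches the OP-Mode change from AP(3) to MESH(7) and beacon-vifs dropping from 2 to 1.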


Best regards,
Kai


2011/12/23 Javier Cardona <javier at cozybit.com>:
> Hi Simon and Kai,
>
> Thanks for reporting this issue. This seems to be a sk buffer
> overrun, probably caused by this patch:
>
> commit 3b69a9c5f264d62a0cf46ea61ed3da732c1f88c2
> Author: Thomas Pedersen <thomas at cozybit.com>
> Date:   Wed Oct 26 14:47:25 2011 -0700
>
>     mac80211: comment allocation of mesh frames
>
>     Remove most references to magic numbers, save a few bytes and hopefully
>     improve readability.
>
>     Signed-off-by: Thomas Pedersen <thomas at cozybit.com>
>     Signed-off-by: John W. Linville <linville at tuxdriver.com>
>
> We'll fix that next year, when Thomas or I return from vacation.
> If you need something sooner than that, you can probably work around
> the issue by increasing the allocated sk buffer size. Or you can try
> the patch that I'll submit shortly to the list.
>
> Cheers,
>
> Javier
>
>
>
>
> On Thu, Dec 22, 2011 at 7:54 AM, Simon Morgenthaler
> <s.morgenthaler at students.unibe.ch> wrote:
>> Hi
>>
>> I guess I'm having exactly the same issue, but using an Atheros AR2315 SoC
>> with the ath5k driver and an older compat-wireless package (2011-07-24) on a
>> 2.6.37.6 kernel.
>>
>> I have the same setup: hostapd running on wlan0 and a mesh network (802.11s) on
>> mesh0. Both interfaces are on the same phy0 and are bridged with a br0 interface.
>>
>> Sometimes the setup seems to work, but often the whole node crashes (most
>> likely the kernel) after a few minutes and reboots.
>>
>> Any idea?
>>
>> Thanks!
>>
>> Simon
>>
>> On Thursday 22 December 2011 16:39:25 Kai Scharwies wrote:
>>> Hello.
>>>
>>> I was trying to create a meshnode which is also a hotspot using the
>>> following steps:
>>> 1. Create virtual interfaces
>>> iw dev wlan0 del
>>> iw phy phy0 interface add mesh0 type mp
>>> iw phy phy0 interface add ap0 type mp (hostapd will change the type to AP)
>>> 2. Set IP addresses, enable routing, etc.
>>> 3. Start hostapd for interface ap0 (channel 13 HT40-) - Clients can connect
>>> fine
>>> 4. On interface mesh0 start authsae or issue unencrypted mesh (on
>>> same channel of course) join via iw on both meshnode and (mesh)gateway
>>> (same result)
>>> Station dump shows the states switching between OPEN_SNT, HOLDING, LISTEN,
>>> but never ESTAB.
>>>
>>> As soon as hostapd is killed and step 4 is repeated the mesh is working
>>> fine. Issuing step 3 after step 4 seems to work but eventually the mesh
>>> de-establishes or even the kernel crashes:
>>>
>>> [ 5762.749369] ath: Failed to stop TX DMA, queues=0x10f!
>>> [ 5762.754738] skb_over_panic: text:d0cc78f3 len:43 put:17
>>> head:cfdcb400 data:cfdcb42e tail:0xcfdcb459 end:0xcfdcb450 dev:<NULL>
>>> [ 5762.766246] ------------[ cut here ]------------
>>> [ 5762.769483] kernel BUG at net/core/skbuff.c:127!
>>> [ 5762.769483] invalid opcode: 0000 [#1]
>>> [ 5762.769483] last sysfs file: /sys/devices/virtual/net/lo/operstate
>>> [ 5762.769483] Modules linked in: iptable_filter aes_i586 aes_generic
>>> ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack
>>> nf_defrag_ipv4 ip_tables x_tables ipv6 arc4 ath9k mac80211 cfg80211
>>> ath9k_common ath9k_hw ath rtc_cmos ftdi_sio rtc_core compat usbserial
>>> rtc_lib evdev led_class ext4 jbd2 crc16 ohci_hcd r6040 ehci_hcd mii
>>> usbcore [last unloaded: scsi_wait_scan]
>>> [ 5762.769483]
>>> [ 5762.769483] Pid: 1092, comm: phy0 Not tainted 2.6.34.10 #1 /
>>> [ 5762.769483] EIP: 0060:[<c11f7792>] EFLAGS: 00000292 CPU: 0
>>> [ 5762.769483] EIP is at skb_over_panic+0x32/0x40
>>> [ 5762.769483] EAX: 00000087 EBX: d0cc78f3 ECX: cfd01d8c EDX: c130d274
>>> [ 5762.769483] ESI: 0000005e EDI: cff092e0 EBP: 0000003f ESP: cfd01d88
>>> [ 5762.769483]  DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
>>> [ 5762.769483] Process phy0 (pid: 1092, ti=cfd00000 task=cf30c760
>>> task.ti=cfd00000)
>>> [ 5762.769483] Stack:
>>> [ 5762.769483]  c130d274 d0cc78f3 0000002b 00000011 cfdcb400 cfdcb42e
>>> cfdcb459 cfdcb450
>>> [ 5762.769483] <0> c130b44e cfdcb448 c11f94c3 cf11f420 d0cc78f3
>>> cf740a60 1fdf4140 cf1c5d80
>>> [ 5762.769483] <0> 00000018 cff9c120 cf498000 00000001 cf11f420
>>> d0cc528e 0000003f d0cdce08
>>> [ 5762.769483] Call Trace:
>>> [ 5762.769483]  [<d0cc78f3>] ? mesh_path_error_tx+0x123/0x210 [mac80211]
>>> [ 5762.769483]  [<c11f94c3>] ? skb_put+0x33/0x40
>>> [ 5762.769483]  [<d0cc78f3>] ? mesh_path_error_tx+0x123/0x210 [mac80211]
>>> [ 5762.769483]  [<d0cc528e>] ? mesh_plink_broken+0x7e/0xb0 [mac80211]
>>> [ 5762.769483]  [<d0c9a29f>] ? ieee80211_tx_status+0x9cf/0xc10 [mac80211]
>>> [ 5762.769483]  [<c10a5a57>] ? kmem_cache_free+0x57/0x80
>>> [ 5762.769483]  [<d0d42555>] ? ath_tx_complete_buf+0xd5/0x130 [ath9k]
>>> [ 5762.769483]  [<d0d44281>] ? ath_drain_txq_list+0xe1/0x140 [ath9k]
>>> [ 5762.769483]  [<d0d44328>] ? ath_draintxq+0x48/0x180 [ath9k]
>>> [ 5762.769483]  [<d0d45c3e>] ? ath_drain_all_txq+0xfe/0x140 [ath9k]
>>> [ 5762.769483]  [<d0d3bf46>] ? ath_prepare_reset+0x46/0xb0 [ath9k]
>>> [ 5762.769483]  [<d0d3d78f>] ? ath_reset_internal+0x5f/0x1a0 [ath9k]
>>> [ 5762.769483]  [<d0d3db30>] ? ath_reset_work+0x0/0x10 [ath9k]
>>> [ 5762.769483]  [<d0d3d8ee>] ? ath_reset+0x1e/0x80 [ath9k]
>>> [ 5762.769483]  [<d0d3db30>] ? ath_reset_work+0x0/0x10 [ath9k]
>>> [ 5762.769483]  [<c1034f1f>] ? worker_thread+0x10f/0x200
>>> [ 5762.769483]  [<c1038240>] ? autoremove_wake_function+0x0/0x30
>>> [ 5762.769483]  [<c1034e10>] ? worker_thread+0x0/0x200
>>> [ 5762.769483]  [<c1037eb4>] ? kthread+0x64/0x70
>>> [ 5762.769483]  [<c1037e50>] ? kthread+0x0/0x70
>>> [ 5762.769483]  [<c1002e36>] ? kernel_thread_helper+0x6/0x10
>>> [ 5762.769483] Code: c9 74 2f 51 ff b0 a0 00 00 00 ff b0 9c 00 00 00
>>> ff b0 a8 00 00 00 ff b0 a4 00 00 00 52 ff 70 50 53 68 74 d2 30 c1 e8
>>> 32 2e 08 00 <0f> 0b 83 c4 24 eb fe b9 4e b4 30 c1 eb ca 55 57 56 53 83
>>> ec 04
>>> [ 5762.769483] EIP: [<c11f7792>] skb_over_panic+0x32/0x40 SS:ESP
>>> 0068:cfd01d88
>>> [ 5763.039324] ---[ end trace b99b70cd3652a432 ]---
>>> [ 5763.044013] Kernel panic - not syncing: Fatal exception in interrupt
>>> [ 5763.050450] Pid: 1092, comm: phy0 Tainted: G      D    2.6.34.10 #1
>>> [ 5763.056788] Call Trace:
>>> [ 5763.059314]  [<c127a56a>] ? panic+0x37/0x91
>>> [ 5763.063577]  [<c1005027>] ? oops_end+0x77/0x80
>>> [ 5763.068099]  [<c1003636>] ? do_invalid_op+0x66/0x80
>>> [ 5763.073058]  [<c11f7792>] ? skb_over_panic+0x32/0x40
>>> [ 5763.078104]  [<c1002e29>] ? common_interrupt+0x29/0x30
>>> [ 5763.083333]  [<c1022807>] ? vprintk+0x107/0x340
>>> [ 5763.088042]  [<d0cc78f3>] ? mesh_path_error_tx+0x123/0x210 [mac80211]
>>> [ 5763.094591]  [<c127c20e>] ? error_code+0x5e/0x70
>>> [ 5763.099385]  [<d0cc78f3>] ? mesh_path_error_tx+0x123/0x210 [mac80211]
>>> [ 5763.106017]  [<d0cc007b>] ? sta_ht_capa_read+0x2bb/0x5d0 [mac80211]
>>> [ 5763.112385]  [<c10035d0>] ? do_invalid_op+0x0/0x80
>>> [ 5763.117256]  [<c11f7792>] ? skb_over_panic+0x32/0x40
>>> [ 5763.122399]  [<d0cc78f3>] ? mesh_path_error_tx+0x123/0x210 [mac80211]
>>> [ 5763.128941]  [<c11f94c3>] ? skb_put+0x33/0x40
>>> [ 5763.133471]  [<d0cc78f3>] ? mesh_path_error_tx+0x123/0x210 [mac80211]
>>> [ 5763.140107]  [<d0cc528e>] ? mesh_plink_broken+0x7e/0xb0 [mac80211]
>>> [ 5763.146467]  [<d0c9a29f>] ? ieee80211_tx_status+0x9cf/0xc10 [mac80211]
>>> [ 5763.153096]  [<c10a5a57>] ? kmem_cache_free+0x57/0x80
>>> [ 5763.158246]  [<d0d42555>] ? ath_tx_complete_buf+0xd5/0x130 [ath9k]
>>> [ 5763.164524]  [<d0d44281>] ? ath_drain_txq_list+0xe1/0x140 [ath9k]
>>> [ 5763.170721]  [<d0d44328>] ? ath_draintxq+0x48/0x180 [ath9k]
>>> [ 5763.176392]  [<d0d45c3e>] ? ath_drain_all_txq+0xfe/0x140 [ath9k]
>>> [ 5763.182495]  [<d0d3bf46>] ? ath_prepare_reset+0x46/0xb0 [ath9k]
>>> [ 5763.188510]  [<d0d3d78f>] ? ath_reset_internal+0x5f/0x1a0 [ath9k]
>>> [ 5763.194700]  [<d0d3db30>] ? ath_reset_work+0x0/0x10 [ath9k]
>>> [ 5763.200365]  [<d0d3d8ee>] ? ath_reset+0x1e/0x80 [ath9k]
>>> [ 5763.205683]  [<d0d3db30>] ? ath_reset_work+0x0/0x10 [ath9k]
>>> [ 5763.211345]  [<c1034f1f>] ? worker_thread+0x10f/0x200
>>> [ 5763.216480]  [<c1038240>] ? autoremove_wake_function+0x0/0x30
>>> [ 5763.222310]  [<c1034e10>] ? worker_thread+0x0/0x200
>>> [ 5763.227267]  [<c1037eb4>] ? kthread+0x64/0x70
>>> [ 5763.231704]  [<c1037e50>] ? kthread+0x0/0x70
>>> [ 5763.236051]  [<c1002e36>] ? kernel_thread_helper+0x6/0x10
>>>
>>>
>>> I am using Atheros 92xx cards and a compat-wireless built from the
>>> o80211s git tree a couple of days ago.
>>>
>>> Usually this setup should be doable, right?
>>> I would be glad to help resolving the issues causing this.
>>>
>>> Is there another mailing list which should be CC'd? (Maybe
>>> ath9k-devel, wireless-dev, hostap)
>>>
>>> Best regards,
>>> Kai
>>> _______________________________________________
>>> Devel mailing list
>>> Devel at lists.open80211s.org
>>> http://open80211s.com/mailman/listinfo/devel
>>>
>
>
>
> --
> Javier Cardona
> cozybit Inc.
> http://www.cozybit.com


