Question on beacon-miss offloading.

Michal Kazior michal.kazior at tieto.com
Mon Jun 15 22:29:36 PDT 2015


On 15 June 2015 at 21:45, Ben Greear <greearb at candelatech.com> wrote:
> On 06/15/2015 08:35 AM, Ben Greear wrote:
>> On 06/14/2015 10:36 PM, Michal Kazior wrote:
>>> On 12 June 2015 at 17:37, Ben Greear <greearb at candelatech.com> wrote:
>>>> On 06/11/2015 11:03 PM, Michal Kazior wrote:
>>>>> On 12 June 2015 at 01:38, Ben Greear <greearb at candelatech.com> wrote:
>>>>>> On 06/11/2015 02:07 PM, Ben Greear wrote:
>>>>>>> In my ath10k CT firmware, I am disabling the beacon-miss offloading
>>>>>>> to save space and because it will not work with lots of virtual
>>>>>>> stations.
>>>>>>>
>>>>>>> But, it must be that I need some way to tell the stack that this
>>>>>>> feature is not enabled, because when suddenly kill my AP, then
>>>>>>> the ath10k station connected to it shows endless 'beacon loss' events
>>>>>>> in 'iw events' output, but it never actually loses connection.
>>>>>>>
>>>>>>> Stock firmware works fine, so probably I just need to disable
>>>>>>> some feature flag when registering the ath10k hardware
>>>>>>> when using CT firmware.
>>>>>>>
>>>>>>> With stock firmware, I see a quick dissassociation due to inactivity.
>>>>>>>
>>>>>>> I am having poor luck finding how a driver tells the stack
>>>>>>> it has beacon miss offload or not, so, does anyone know how
>>>>>>> this is controlled?
>>>>>>
>>>>>> I still am not sure why stock firmware works, but it appears
>>>>>> the reason mine is failing is that the ACK status for mgt frames
>>>>>> is always set to TRUE since the ath10k wmi-mgt-tx API is so
>>>>>> lame.  So, mac80211 does a probe, ath10k lies and says it was
>>>>>> acked, and mac80211 then things all is well for another few
>>>>>> seconds.
>>>>>
>>>>> mac80211 shouldn't do a Probe Req to an AP on beacon loss because
>>>>> ath10k advertises it supports tx-status report. Hence mac80211 should
>>>>> use NullFunc frames which shouldn't go through wmi-mgmt-tx but htt
>>>>> tx-frm.
>>>>>
>>>>> But then again: NullFunc status reporting via htt tx-frm was broken on
>>>>> 10.1 if memory serves right. I believe it was fixed in 10.2 or 10.2.4.
>>>>>
>>>>> This problem has been effectively obscured on stock 10.1 by the
>>>>> offloaded beacon miss.
>>>>
>>>> Thanks for the hint.  I was able to fix my firmware to properly
>>>> return htt tx status, and now it appears to work properly.
>>>>
>>>> A quick throughput test works as well, so hopefully no regressions.
>>>>
>>>> I guess the NulFunc related comment is incorrect for 10.1 stock firmware?
>>>>
>>>> Maybe some others?
>>>>
>>>> static void ath10k_tx_htt(struct ath10k *ar, struct sk_buff *skb)
>>>> {
>>>>         struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;
>>>>         int ret = 0;
>>>>
>>>>         if (ar->htt.target_version_major >= 3) {
>>>>                 /* Since HTT 3.0 there is no separate mgmt tx command */
>>>>                 ret = ath10k_htt_tx(&ar->htt, skb);
>>>>                 goto exit;
>>>>         }
>>>>
>>>>         if (ieee80211_is_mgmt(hdr->frame_control)) {
>>>>                 if (test_bit(ATH10K_FW_FEATURE_HAS_WMI_MGMT_TX,
>>>>                              ar->fw_features)) {
>>>>                         if (skb_queue_len(&ar->wmi_mgmt_tx_queue) >=
>>>>                             ATH10K_MAX_NUM_MGMT_PENDING) {
>>>>                                 ath10k_warn(ar, "reached WMI management transmit queue limit\n");
>>>>                                 ret = -EBUSY;
>>>>                                 goto exit;
>>>>                         }
>>>>
>>>>                         skb_queue_tail(&ar->wmi_mgmt_tx_queue, skb);
>>>>                         ieee80211_queue_work(ar->hw, &ar->wmi_mgmt_tx_work);
>>>>                 } else {
>>>>                         ret = ath10k_htt_mgmt_tx(&ar->htt, skb);
>>>>                 }
>>>>         } else if (!test_bit(ATH10K_FW_FEATURE_HAS_WMI_MGMT_TX,
>>>>                              ar->fw_features) &&
>>>>                    ieee80211_is_nullfunc(hdr->frame_control)) {
>>>>                 /* FW does not report tx status properly for NullFunc frames
>>>>                  * unless they are sent through mgmt tx path. mac80211 sends
>>>>                  * those frames when it detects link/beacon loss and depends
>>>>                  * on the tx status to be correct. */
>>>>                 ret = ath10k_htt_mgmt_tx(&ar->htt, skb);
>>>>         } else {
>>>>                 ret = ath10k_htt_tx(&ar->htt, skb);
>>>>         }
>>>
>>> The NullFunc workaround was originally done for 999.999.0.636 but
>>> should be true for 10.1 as well with the sole exception the latter
>>> doesn't have htt-mgmt-tx to workaround the problem.
>>
>> Is it correct to say that this logic is completely broken for stock 10.1 firmware
>> because null-func frames are not management frames, so the packet goes out the
>> ath10k_htt_mgmt_tx call, and stock 10.1 does not properly do tx status
>> for htt-tx frames?

The has-wmi-mgmt-tx was originally designed to identify FW supporting
htt-mgmt-tx. It ended up to be used (implicitly) as a part of the
NullFunc workaround.

Hmm.. I guess it's not really ideal now because latest 10.2.4 FW which
supports htt-mgmt-tx doesn't need the workaround (htt-tx report
correct tx-status for NullFuncs)..


>> And slightly different question:  Once I put in proper htt ACK reporting into
>> my firmware, I notice that stations often loose connectivity when they are
>> idle.  I captured a sniff, and it appears the null-func packets are sent,
>> but the header is not requesting an explicit ACK, and the AP does not ACK it,
>> so connection is lost.
>>
>> I guess that the null-func frames should be requesting explicit ACK?
>
> This patch below (with firmware fix to make it actually pay attention
> to this special TID in station mode), fixes the null-func to be non-qos
> frames and get immediate ack.  This in turn seems to fix the disconnect
> issue I was facing, though I need to test some more to be sure.
>
> This seem reasonable, maybe even for upstream?
>
> [greearb at ben-dt2 ath10k]$ git diff
> diff --git a/drivers/net/wireless/ath/ath10k/mac.c b/drivers/net/wireless/ath/ath10k/mac.c
> index 6ca48e3..610447e 100644
> --- a/drivers/net/wireless/ath/ath10k/mac.c
> +++ b/drivers/net/wireless/ath/ath10k/mac.c
> @@ -2213,6 +2213,9 @@ static u8 ath10k_tx_h_get_tid(struct ieee80211_hdr *hdr)
>         if (ieee80211_is_mgmt(hdr->frame_control))
>                 return HTT_DATA_TX_EXT_TID_MGMT;
>
> +       if (ieee80211_is_nullfunc(hdr->frame_control))
> +               return HTT_DATA_TX_EXT_TID_NON_QOS_MCAST_BCAST;
> +

That's weird.. QoS NullFunc frames are already converted to non-QoS
frames, see ath10k_tx_h_nwifi():

        /* Some firmware revisions don't handle sending QoS NullFunc well.
         * These frames are mainly used for CQM purposes so it doesn't really
         * matter whether QoS NullFunc or NullFunc are sent.
         */
        hdr = (void *)skb->data;
        if (ieee80211_is_qos_nullfunc(hdr->frame_control))
                cb->htt.tid = HTT_DATA_TX_EXT_TID_NON_QOS_MCAST_BCAST;

Or is it that ath10k_tx_h_get_tid() has a bug that ends up assigning
an invalid tid for non-QoS NullFunc frames? If so I guess your patch
would be okay with some comments/commit-log explaining why/what was
wrong.


Michał



More information about the ath10k mailing list