[PATCH] ath10k: handle cycle count wrap around

Michal Kazior michal.kazior at tieto.com
Thu May 14 22:51:18 PDT 2015


On 14 May 2015 at 20:40, Srinivasa Duvvuri <sduvvuri at google.com> wrote:
> On Wed, May 13, 2015 at 10:25 PM, Michal Kazior <michal.kazior at tieto.com> wrote:
>> On 14 May 2015 at 03:56, Srinivasa Duvvuri <sduvvuri at google.com> wrote:
>>> When the cycle counts part of wmi_ch_info_ev_arg reach max value of
>>> 0xffffffff The HW/FW right shifts all the counters by 1 . The cycle
>>> count will be reset to 0x7fffffff and starts counting from 0x7fffffff.
>>> At the same time all the other counters will also have their values
>>> right shifted by 1 (divided by 2). There is no way to handle this odd
>>> wrap around. Detect and ignore the wrap around case.
>>> This change has 2 parts.
>>>
>>>  1) Ignore all the cycle counts , if the cycle count field wraps around.
>>>  2) With HW frequency of 88MHz the cycle count wraps around every 24
>>> seconds. the survey_last_cycle_count saved will become stale with in
>>> 24 seconds. Reset it to 0 at every scan start.
>>
>> So what does it fix from a user perspective?
>
> User gets a really large/incorrect value at random times when he dumps
>  the survey data
> using iw dev wlan0 survey dump

This should be put in the commit log so anyone reading it will
understand what this patch fixes from the user point of view.


>>> Signed-off-by: Srinivasa Duvvuri <sduvvuri at chromium.org>
>>>
>>> diff --git a/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>> b/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>> index 720b6eb..09616bb 100644
>>> --- a/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>> +++ b/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>
>> This doesn't look like an upstream kernel patch. Rebase it for
>> github.com/kvalo/ath, please.
>>
>>
>>> @@ -1158,6 +1158,8 @@
>>>   complete(&ar->scan.started);
>>>   break;
>>>   }
>>> + /* reset the last cycle count as the last cycle count is stale from
>>> last scan. */
>>> + ar->survey_last_cycle_count = 0;
>>>  }
>>>
>>>  static void ath10k_wmi_event_scan_start_failed(struct ath10k *ar)
>>> @@ -1716,21 +1718,36 @@
>>>   goto exit;
>>>   }
>>>
>>> - if (cmd_flags & WMI_CHAN_INFO_FLAG_COMPLETE) {
>>> + if (cmd_flags & WMI_CHAN_INFO_FLAG_COMPLETE && ar->survey_last_cycle_count) {
>>
>> Channel info events come in pairs: without and with FLAG_COMPLETE
>> respectively. Thus it is pointless to reset
>> `ar->survey_last_cycle_count` and it's pointless to use it in this
>> condition.
>
> You are right, this check seems to be not useful, but it addresses one
> case where between 2 scans
> FW generates 2 events back to back with WMI_CHAN_INFO_FLAG_COMPLETE flag set.
>  here is how the events are generated from FW with 2 scans issued  .
>
>  scan req 1 to FW.
>  scan started  event from FW.
>  scan bss_channel  event from FW .
>  chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
>  scan foreign channel event from FW
>  chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag cleared.
>  scan bss_channel  event from FW .
>  chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
>   ......
>   ......
>  scan foreign channel event from FW
>  chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag cleared.
>  scan bss_channel  event from FW .
>  chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
>  scan event completed.
>
>
>   scan req 2 to FW
>   scan started  event from FW.
>   scan bss_channel  event from FW .
>   chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
>   ............
>   ............
>
>
> From the events above there is one chan info event from FW with the
> complete flag at the end of first scan
> and  one chan info event  from FW at the beginning of second scan. The
> logic in ath10k computes the
> difference between the 2 events . Depending upon the time at which the
> second scan is issued , the HW cycle
> count could have rolled over several times. the check above ignores
> the first chan info event.
>
> Alternatively we can fix the FW to not generate first bss chan event
> and the corresponding chan info event.
> I guess the idea of having a bss chan event at the beginning and end
> of scan followed by the chan
> info event is to provide the survey  data on bss channel between the
> scans. Since these counters wrap
> around every 24 seconds,  the difference is not going to be reliable any way.

Oh. I've never noticed that. I guess that's because I've mostly tested
survey with `iw scan` which scans all channels including BSS channel.
Otherwise this problem doesn't manifest because BSS-related chan-info
pair is overwritten by subsequent foreign-related chan-info.


>>>   /* During scanning chan info is reported twice for each
>>> - * visited channel. The reported cycle count is global
>>> + * visited channel. The reported cycle count is global
>>>   * and per-channel cycle count must be calculated */
>>>
>>> - cycle_count -= ar->survey_last_cycle_count;
>>> - rx_clear_count -= ar->survey_last_rx_clear_count;
>>> -
>>>   survey = &ar->survey[idx];
>>> - survey->channel_time = WMI_CHAN_INFO_MSEC(cycle_count);
>>> - survey->channel_time_rx = WMI_CHAN_INFO_MSEC(rx_clear_count);
>>> +
>>> + if (cycle_count  <  ar->survey_last_cycle_count) {
>>> + /*
>>> +  * Ignore the wrapped ar ust_cycnd data. When the total cycle count
>>> +  * wraps around, FW/HW will right shift all the counts
>>> +  * (including total cycle counts) by 1.The cycle counts wrap around
>>> +  * every 24 seconds. To handle this wrap around behavior,
>>> +  * we need the value of each counter at the time of wrap around which
>>> +  * FW does not provide. For now ignore the data.
>>> +  */
>>
>> If you really think about it, this can be calculated.
>>
>> Channel visit shouldn't be longer than 24s, not even 12s. It's mostly
>> 20-100ms for hw_scan and up to 5s in offchannel (which is rare
>> anyway).
>>
>> Hence you can detect the wrap around AND calculate the actual time difference.
>
> These FW counters increment continuously and also across channel
> changes and between scans.
> So wrap around can happen  any time.

For foreign-channel visits this can be easily calculated since they
are never longer than the wraparound time.

BSS-channel visits (or between-scans if you will) should be just
discarded though as you suggested. I'm not fond of re-using
ar->survey_last_cycle_count though because it always stores a value
and 0 is just one of them. Adding a special meaning to 0 looks ugly to
me. I would suggest a dedicated var which tracks whether chan_info
event without _FLAG_COMPLETE came in. Something along the logic of:

 event_chan_info():
  ...
  if (flags & COMPLETE) {
     if (ar->ch_info_can_process)
        update(ar->survey);

     ar->ch_info_can_process = false;
  } else {
     ar->ch_info_can_process = true;
  }

This cleanly enforces the expectation that was assumed before, i.e.
that ch_info comes in pairs always.


Michał



More information about the ath10k mailing list