[PATCH] ath10k: handle cycle count wrap around
Michal Kazior
michal.kazior at tieto.com
Thu May 14 22:51:18 PDT 2015
On 14 May 2015 at 20:40, Srinivasa Duvvuri <sduvvuri at google.com> wrote:
> On Wed, May 13, 2015 at 10:25 PM, Michal Kazior <michal.kazior at tieto.com> wrote:
>> On 14 May 2015 at 03:56, Srinivasa Duvvuri <sduvvuri at google.com> wrote:
>>> When the cycle counts part of wmi_ch_info_ev_arg reach max value of
>>> 0xffffffff The HW/FW right shifts all the counters by 1 . The cycle
>>> count will be reset to 0x7fffffff and starts counting from 0x7fffffff.
>>> At the same time all the other counters will also have their values
>>> right shifted by 1 (divided by 2). There is no way to handle this odd
>>> wrap around. Detect and ignore the wrap around case.
>>> This change has 2 parts.
>>>
>>> 1) Ignore all the cycle counts , if the cycle count field wraps around.
>>> 2) With HW frequency of 88MHz the cycle count wraps around every 24
>>> seconds. the survey_last_cycle_count saved will become stale with in
>>> 24 seconds. Reset it to 0 at every scan start.
>>
>> So what does it fix from a user perspective?
>
> User gets a really large/incorrect value at random times when he dumps
> the survey data
> using iw dev wlan0 survey dump
This should be put in the commit log so anyone reading it will
understand what this patch fixes from the user point of view.
>>> Signed-off-by: Srinivasa Duvvuri <sduvvuri at chromium.org>
>>>
>>> diff --git a/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>> b/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>> index 720b6eb..09616bb 100644
>>> --- a/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>> +++ b/drivers/net/wireless-3.18/ath/ath10k/wmi.c
>>
>> This doesn't look like an upstream kernel patch. Rebase it for
>> github.com/kvalo/ath, please.
>>
>>
>>> @@ -1158,6 +1158,8 @@
>>> complete(&ar->scan.started);
>>> break;
>>> }
>>> + /* reset the last cycle count as the last cycle count is stale from
>>> last scan. */
>>> + ar->survey_last_cycle_count = 0;
>>> }
>>>
>>> static void ath10k_wmi_event_scan_start_failed(struct ath10k *ar)
>>> @@ -1716,21 +1718,36 @@
>>> goto exit;
>>> }
>>>
>>> - if (cmd_flags & WMI_CHAN_INFO_FLAG_COMPLETE) {
>>> + if (cmd_flags & WMI_CHAN_INFO_FLAG_COMPLETE && ar->survey_last_cycle_count) {
>>
>> Channel info events come in pairs: without and with FLAG_COMPLETE
>> respectively. Thus it is pointless to reset
>> `ar->survey_last_cycle_count` and it's pointless to use it in this
>> condition.
>
> You are right, this check seems to be not useful, but it addresses one
> case where between 2 scans
> FW generates 2 events back to back with WMI_CHAN_INFO_FLAG_COMPLETE flag set.
> here is how the events are generated from FW with 2 scans issued .
>
> scan req 1 to FW.
> scan started event from FW.
> scan bss_channel event from FW .
> chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
> scan foreign channel event from FW
> chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag cleared.
> scan bss_channel event from FW .
> chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
> ......
> ......
> scan foreign channel event from FW
> chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag cleared.
> scan bss_channel event from FW .
> chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
> scan event completed.
>
>
> scan req 2 to FW
> scan started event from FW.
> scan bss_channel event from FW .
> chan info event ( WMI_CHAN_INFO_EVENTID ) from FW with
> WMI_CHAN_INFO_FLAG_COMPLETE flag set.
> ............
> ............
>
>
> From the events above there is one chan info event from FW with the
> complete flag at the end of first scan
> and one chan info event from FW at the beginning of second scan. The
> logic in ath10k computes the
> difference between the 2 events . Depending upon the time at which the
> second scan is issued , the HW cycle
> count could have rolled over several times. the check above ignores
> the first chan info event.
>
> Alternatively we can fix the FW to not generate first bss chan event
> and the corresponding chan info event.
> I guess the idea of having a bss chan event at the beginning and end
> of scan followed by the chan
> info event is to provide the survey data on bss channel between the
> scans. Since these counters wrap
> around every 24 seconds, the difference is not going to be reliable any way.
Oh. I've never noticed that. I guess that's because I've mostly tested
survey with `iw scan` which scans all channels including BSS channel.
Otherwise this problem doesn't manifest because BSS-related chan-info
pair is overwritten by subsequent foreign-related chan-info.
>>> /* During scanning chan info is reported twice for each
>>> - * visited channel. The reported cycle count is global
>>> + * visited channel. The reported cycle count is global
>>> * and per-channel cycle count must be calculated */
>>>
>>> - cycle_count -= ar->survey_last_cycle_count;
>>> - rx_clear_count -= ar->survey_last_rx_clear_count;
>>> -
>>> survey = &ar->survey[idx];
>>> - survey->channel_time = WMI_CHAN_INFO_MSEC(cycle_count);
>>> - survey->channel_time_rx = WMI_CHAN_INFO_MSEC(rx_clear_count);
>>> +
>>> + if (cycle_count < ar->survey_last_cycle_count) {
>>> + /*
>>> + * Ignore the wrapped ar ust_cycnd data. When the total cycle count
>>> + * wraps around, FW/HW will right shift all the counts
>>> + * (including total cycle counts) by 1.The cycle counts wrap around
>>> + * every 24 seconds. To handle this wrap around behavior,
>>> + * we need the value of each counter at the time of wrap around which
>>> + * FW does not provide. For now ignore the data.
>>> + */
>>
>> If you really think about it, this can be calculated.
>>
>> Channel visit shouldn't be longer than 24s, not even 12s. It's mostly
>> 20-100ms for hw_scan and up to 5s in offchannel (which is rare
>> anyway).
>>
>> Hence you can detect the wrap around AND calculate the actual time difference.
>
> These FW counters increment continuously and also across channel
> changes and between scans.
> So wrap around can happen any time.
For foreign-channel visits this can be easily calculated since they
are never longer than the wraparound time.
BSS-channel visits (or between-scans if you will) should be just
discarded though as you suggested. I'm not fond of re-using
ar->survey_last_cycle_count though because it always stores a value
and 0 is just one of them. Adding a special meaning to 0 looks ugly to
me. I would suggest a dedicated var which tracks whether chan_info
event without _FLAG_COMPLETE came in. Something along the logic of:
event_chan_info():
...
if (flags & COMPLETE) {
if (ar->ch_info_can_process)
update(ar->survey);
ar->ch_info_can_process = false;
} else {
ar->ch_info_can_process = true;
}
This cleanly enforces the expectation that was assumed before, i.e.
that ch_info comes in pairs always.
Michał
More information about the ath10k
mailing list