[PATCH] wcn36xx: Fix software-driven scan

Bryan O'Donoghue bryan.odonoghue at linaro.org
Thu Aug 20 08:14:56 EDT 2020


On 20/08/2020 12:27, Loic Poulain wrote:
> HI Bryan,
> 
> On Thu, 20 Aug 2020 at 01:54, Bryan O'Donoghue
> <bryan.odonoghue at linaro.org> wrote:
>>
>> On 19/08/2020 11:52, Bryan O'Donoghue wrote:
>>> On 19/08/2020 09:15, Loic Poulain wrote:
>>>> Hi Bryan,
>>>>
>>>> On Tue, 18 Aug 2020 at 20:00, Bryan O'Donoghue
>>>> <pure.logic at nexus-software.ie> wrote:
>>>>>
>>>>> On 18/08/2020 14:34, Loic Poulain wrote:
>>>>>> On Tue, 18 Aug 2020 at 08:15, Kalle Valo <kvalo at codeaurora.org> wrote:
>>>>>>>
>>>>>>> Loic Poulain <loic.poulain at linaro.org> writes:
>>>>>>>
>>>>>>>> For software-driven scan, rely on mac80211 software scan instead
>>>>>>>> of internal driver implementation. The internal implementation
>>>>>>>> cause connection trouble since it keep the antenna busy during
>>>>>>>> the entire scan duration, moreover it's only a passive scanning
>>>>>>>> (no probe request). Therefore, let mac80211 manages sw scan.
>>>>>>>>
>>>>>>>> Note: we fallback to software scan if firmware does not report
>>>>>>>> scan offload support or if we need to scan the 5Ghz band (currently
>>>>>>>> not supported by the offload scan...).
>>>>>>>>
>>>>>>>> Signed-off-by: Loic Poulain <loic.poulain at linaro.org>
>>>>>>>> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue at linaro.org>
>>>>>>>
>>>>>>> What changed since v2? Please always include a changelog so that I
>>>>>>> don't
>>>>>>> need to guess what you have changed in the patch. No need to resend, a
>>>>>>> reply is enough.
>>>>>>
>>>>>> Yes sorry, this patch has been rebased on ath master and squashed with
>>>>>> Bryan's fix:
>>>>>>         wcn36xx: Set sw-scan chan to 0 when not associated
>>>>>> No additional changes have been made on top of the initial patches.
>>>>>>
>>>>>> Regards,
>>>>>> Loic
>>>>>>
>>>>>
>>>>> This is excruciatingly slow on Android.
>>>>>
>>>>> Android I'm finding unlike Debian where this stuff was tested, merrily
>>>>> issues scan requests every 30 seconds.
>>>>>
>>>>> For me this ends up looking like this
>>>>>
>>>>> [  162.296995] wcn36xx: mac config changed 0x00000040
>>>>> [  162.297001] wcn36xx: wcn36xx_config channel switch=1
>>>>> [  162.297005] wcn36xx: hal init scan mode 2
>>>>> [  177.492402] wcn36xx: mac config changed 0x00000040
>>>>> [  177.492409] wcn36xx: wcn36xx_config channel switch=40
>>>>> [  177.492413] wcn36xx: hal finish scan mode 2
>>>>>
>>>>> Basically it takes 15 seconds to complete.
>>>>>
>>>>> Interleaved scan seems very slow and is IMO unusable on Android
>>>>
>>>>
>>>> The software scan is driven by mac80211 which interleaves channels
>>>> scanning
>>>> with the regular data operations. In worst case, it scans during 111ms
>>>> on the
>>>> channel and switches back to the operating channel for 200ms.
>>>> So, with 35 channels to scan that would take about 11 seconds to
>>>> complete...
>>>> So yes, it's quite slow...
>>>
>>> Yes looks a scheduling thing in mac80211.
>>>
>>>> However, one remaining
>>>> thing to try would be to send an update-channel-list command to the
>>>> firmware before each offload scan (cf UPDATE_CHANNEL_LIST_REQ
>>>> in the downstream driver). That's not something I've tried yet.
>>>
>>> I've tried
>>>
>>> 1. Setting the operating frequency to a 5ghz band before the scan
>>> 2. Updating the channel list with 2g/5g channels via
>>> UPDATE_CHANNEL_LIST_REQ
>>> 3. Updating the channel list with 5g only with UPDATE_CHANNEL_LIST_REQ
>>> 4. Doing 5g channels only in the channel list of
>>> WCN36XX_HAL_START_SCAN_OFFLOAD_REQ
>>>
>>> I don't think the channel list matters, if I recall rightly I've also
>>> tried leaving out the channel list in WCN36XX_HAL_START_SCAN_OFFLOAD_REQ
>>> and get back the same set of channels in the result
>>>
>>> I guess we should take this patch anyway, since it makes wcn36xx and
>>> mac80211 agree on ownership of the antenna...
>>>
>>> ---
>>> bod
>>
>> Ah.
>>
>> If we implement ops->flush() we can slash that idle time down significantly
>>
>> static void ieee80211_scan_state_resume(struct ieee80211_local *local,
>>                                           unsigned long *next_delay)
>> {
>>
>>           if (local->ops->flush) {
>>                   ieee80211_flush_queues(local, NULL, false);
>>                   *next_delay = 0;
>>           } else
>>                   *next_delay = HZ / 10;
>> }
>>
>> Trivial hack example:
>>
>> +void wcn36xx_flush(struct ieee80211_hw *hw, struct ieee80211_vif *vif,
>> +                  u32 queues, bool drop)
>> +{
>> +}
>> +
>>    static const struct ieee80211_ops wcn36xx_ops = {
>>           .start                  = wcn36xx_start,
>>           .stop                   = wcn36xx_stop,
>> @@ -1187,7 +1192,7 @@ static const struct ieee80211_ops wcn36xx_ops = {
>>           .sta_add                = wcn36xx_sta_add,
>>           .sta_remove             = wcn36xx_sta_remove,
>>           .ampdu_action           = wcn36xx_ampdu_action,
>> -
>> +       .flush                  = wcn36xx_flush,
>>           CFG80211_TESTMODE_CMD(wcn36xx_tm_cmd)
>>    };
> 
> Interestingly, we could indeed get rid of this extra 100ms.
> Maybe a way to implement this would be to wait for the TX
> ring to be empty in dxe.
> 
>> That ~ 20 seconds is cut to just 4 ! Which to be fair seems to be the
>> kind of time it took to do the old version of the software scan.
> 
> Sure but In the below case, it seems scan occurs while you're not
> connected to an AP, which prevents interleaving of operating channel.
> The same scan while connected should be longer.

You're right, don't get excited about a fix you 'find' at 1 am.

Still having a hard time getting this one running on Android, even 
dropping the scan interval to 60 seconds.

ops->flush() helps a bit but not enough

In terms of keeping the antenna busy during the current scan - is that a 
problem if we skip over the active channel when connected to an AP ?

__ieee80211_scan_completed() seems to re-schedule work which was delayed 
while hardware scan was ongoing, so, the TX path should be OK with the 
current scan.

The issue then I guess is on the RX path. During a scan we could tune to 
the current operational channel, the AP could send data but, we would 
drop it, because we are scanning.

One solution might be, if we were to skip scanning the operational 
channel when connected to an STA, then this problem would not happen.

Because the only time we would be tuned to the operational channel would 
be when we are also in a state to receive any data the AP might send.

We should still get a notification from the firmware on 
disconnect/dis-assocation.




More information about the wcn36xx mailing list