[PATCH v4 1/3] wifi: ath12k: report station mode transmit rate

Lingbo Kong quic_lingbok at quicinc.com
Wed Jul 3 23:05:57 PDT 2024



On 2024/6/17 19:50, Lingbo Kong wrote:
> 
> 
> On 2024/6/5 14:31, Lingbo Kong wrote:
>>
>>
>> On 2024/4/26 19:21, Kalle Valo wrote:
>>> Lingbo Kong <quic_lingbok at quicinc.com> writes:
>>>
>>>> On 2024/4/26 0:54, Kalle Valo wrote:
>>>>> Lingbo Kong <quic_lingbok at quicinc.com> writes:
>>>>>
>>>>>> +static void ath12k_dp_tx_update_txcompl(struct ath12k *ar,
>>>>>> +                                        struct hal_tx_status *ts)
>>>>>> +{
>>>>>> +    struct ath12k_base *ab = ar->ab;
>>>>>> +    struct ath12k_peer *peer;
>>>>>> +    struct ath12k_sta *arsta;
>>>>>> +    struct ieee80211_sta *sta;
>>>>>> +    u16 rate;
>>>>>> +    u8 rate_idx = 0;
>>>>>> +    int ret;
>>>>>> +
>>>>>> +    spin_lock_bh(&ab->base_lock);
>>>>>
>>>>> Did you analyse how this function, and especially taking the
>>>>> base_lock, affects performance?
>>>>
>>>> The base_lock is taken here because ath12k_peer_find_by_id() needs it
>>>> to look up the peer from ts->peer_id, and I think that might affect
>>>> performance.
>>>>
>>>> Do I need to run a throughput test?
>>>
>>> Ok, so to answer my question: no, you didn't do any performance
>>> analysis. Throughput test might not be enough, for example the driver
>>> can be used on slower systems and running the test on a fast CPU might
>>> not reveal any problem. A proper analysis would be much better.
>>>
>>
>> Hi Kalle,
>> I did a simple performance analysis of the
>> ath12k_dp_tx_update_txcompl() function on slower systems.
>>
>> First, I used the perf tool to set dynamic tracepoints in the
>> ath12k_dp_tx_complete_msdu() function, and then ran
>> "iperf -c ip address -w 4M -n 1G -i 1" to generate traffic.
>>
>> While the test was running, I used ./perf record -a -g to profile
>> the system.
>>
>> Finally, I compared the results with and without this patch.
>>
>> without this patch
>> ./perf report output
>> children    self     command        symbol
>> 7.28%       0.08%    ksoftirqd/0    ath12k_dp_tx_complete_msdu
>> 5.96%       0.03%    swapper        ath12k_dp_tx_complete_msdu
>>
>> iperf output
>> [  1] 0.0000-62.6712 sec  1.00 GBytes   137 Mbits/sec
>>
>> with this patch
>> ./perf report output
>> children    self     command        symbol
>> 7.42%       0.08%    ksoftirqd/0    ath12k_dp_tx_complete_msdu
>> 6.32%       0.03%    swapper        ath12k_dp_tx_complete_msdu
>>
>> iperf output
>> [  1] 0.0000-62.6732 sec  1.00 GBytes   137 Mbits/sec
>>
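>> As can be seen from the perf output above, with this patch the
>> "children" CPU time percentage of ath12k_dp_tx_complete_msdu()
>> increases by about 0.5 percentage points in total (7.28% -> 7.42%
>> for ksoftirqd/0 and 5.96% -> 6.32% for swapper), while the iperf
>> throughput stays at 137 Mbits/sec.
>>
>> So I think applying this patch does have an impact on system
>> performance, but the impact is small and I think it can be
>> ignored :)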
>>
>> Best regards
>> Lingbo Kong
> 
> Hi Kalle,
> Do you have any comments on the above? :)
> 
> best regards
> Lingbo Kong

Hi Kalle,

In this patch, ath12k takes base_lock because it needs to call
ath12k_peer_find_by_id() to find the peer from peer_id and then reach
the ieee80211_sta through that peer. The base_lock protects data such
as the peers.

I've thought of an alternative approach that avoids base_lock: we could
use ieee80211_find_sta_by_ifaddr() to look up the ieee80211_sta directly
from hdr->addr1, which would potentially eliminate the need for
base_lock.

Note that ieee80211_find_sta_by_ifaddr() must be called under the RCU
read lock. Fortunately, ath12k_dp_tx_complete_msdu() already takes
rcu_read_lock().
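
A rough, untested sketch of what that lookup might look like is below.
It is kernel-context code, so it only builds inside a kernel tree. The
helper name ath12k_dp_tx_sta_from_hdr() is hypothetical, and the sketch
assumes that msdu->data still points at the 802.11 header when the
completion is processed, which would need to be verified. Neither is
part of the posted patch.

```c
#include <linux/ieee80211.h>
#include <linux/skbuff.h>
#include <net/mac80211.h>

/*
 * Hypothetical helper (illustration only, not from the patch).
 *
 * Assumption: msdu->data points at the 802.11 header of the completed
 * frame. The caller must hold rcu_read_lock(); the returned pointer is
 * only valid inside that RCU read-side critical section.
 */
static struct ieee80211_sta *
ath12k_dp_tx_sta_from_hdr(struct ieee80211_hw *hw, struct sk_buff *msdu)
{
	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)msdu->data;

	/*
	 * addr1 is the receiver address of the transmitted frame.
	 * Passing NULL as localaddr matches a station on any interface.
	 * The result may be NULL, so the caller has to check it.
	 */
	return ieee80211_find_sta_by_ifaddr(hw, hdr->addr1, NULL);
}
```

With something like this, the tx completion path would not need to
take base_lock just to reach the ieee80211_sta. For a station-mode
interface, addr1 of a transmitted data frame is the AP's address, so
the lookup would return the sta entry for the AP.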

I can rebase on the latest code and post v5:)

Best regards
Lingbo Kong




