[PATCH v4 1/3] wifi: ath12k: report station mode transmit rate
Lingbo Kong
quic_lingbok at quicinc.com
Wed Jul 3 23:05:57 PDT 2024
On 2024/6/17 19:50, Lingbo Kong wrote:
>
>
> On 2024/6/5 14:31, Lingbo Kong wrote:
>>
>>
>> On 2024/4/26 19:21, Kalle Valo wrote:
>>> Lingbo Kong <quic_lingbok at quicinc.com> writes:
>>>
>>>> On 2024/4/26 0:54, Kalle Valo wrote:
>>>>> Lingbo Kong <quic_lingbok at quicinc.com> writes:
>>>>>
>>>>>> +static void ath12k_dp_tx_update_txcompl(struct ath12k *ar, struct
>>>>>> hal_tx_status *ts)
>>>>>> +{
>>>>>> + struct ath12k_base *ab = ar->ab;
>>>>>> + struct ath12k_peer *peer;
>>>>>> + struct ath12k_sta *arsta;
>>>>>> + struct ieee80211_sta *sta;
>>>>>> + u16 rate;
>>>>>> + u8 rate_idx = 0;
>>>>>> + int ret;
>>>>>> +
>>>>>> + spin_lock_bh(&ab->base_lock);
>>>>>
>>>>> Did you analyse how this function, and especially taking the
>>>>> base_lock,
>>>>> affects performance?
>>>>
>>>> The base_lock is used here because of the need to look for peers based
>>>> on the ts->peer_id when calling ath12k_peer_find_by_id() function,
>>>> which i think might affect performance.
>>>>
>>>> Do i need to run a throughput test?
>>>
>>> Ok, so to answer my question: no, you didn't do any performance
>>> analysis. Throughput test might not be enough, for example the driver
>>> can be used on slower systems and running the test on a fast CPU might
>>> not reveal any problem. A proper analysis would be much better.
>>>
>>
>> Hi Kalle,
>> I did a simple performance analysis of the
>> ath12k_dp_tx_update_txcompl() function on a slower system.
>>
>> First, I used the perf tool to set dynamic tracepoints in the
>> ath12k_dp_tx_complete_msdu() function, and then ran
>> "iperf -c ip address -w 4M -n 1G -i 1" to generate traffic.
>>
>> While the traffic was running, I used ./perf record -a -g to profile
>> the whole system.
>>
>> Finally, I compared the results with and without this patch.
>>
>> Without this patch:
>>
>> ./perf report output
>>   children   self    command       symbol
>>   7.28%      0.08%   ksoftirqd/0   ath12k_dp_tx_complete_msdu
>>   5.96%      0.03%   swapper       ath12k_dp_tx_complete_msdu
>>
>> iperf output
>>   [ 1] 0.0000-62.6712 sec 1.00 GBytes 137 Mbits/sec
>>
>> With this patch:
>>
>> ./perf report output
>>   children   self    command       symbol
>>   7.42%      0.08%   ksoftirqd/0   ath12k_dp_tx_complete_msdu
>>   6.32%      0.03%   swapper       ath12k_dp_tx_complete_msdu
>>
>> iperf output
>>   [ 1] 0.0000-62.6732 sec 1.00 GBytes 137 Mbits/sec
>>
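>> As the perf output above shows, with this patch the CPU time
>> attributed to ath12k_dp_tx_complete_msdu() (children column) increases
>> by about 0.5 percentage points in total: from 7.28% to 7.42% for
>> ksoftirqd/0 and from 5.96% to 6.32% for swapper. The iperf throughput
>> is unchanged at 137 Mbits/sec.
>>
>> So I think applying this patch does have an impact on system
>> performance, but the impact is small and I think it can be
>> ignored :)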
>>
>> Best regards
>> Lingbo Kong
>
> Hi Kalle,
> Do you have any comments on the above? :)
>
> best regards
> Lingbo Kong
Hi Kalle,

In this patch, ath12k takes base_lock because it has to call
ath12k_peer_find_by_id() to find the peer by peer_id and then reach the
ieee80211_sta through that peer. The base_lock protects shared data such
as the peers.

I have thought about an alternative approach that avoids base_lock. We
could use ieee80211_find_sta_by_ifaddr() to look up the ieee80211_sta
directly from hdr->addr1, which would potentially remove the need for
base_lock in this path.

Note that ieee80211_find_sta_by_ifaddr() must be called under
rcu_read_lock(). Fortunately, ath12k_dp_tx_complete_msdu() already takes
rcu_read_lock().
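
To make the idea concrete, here is a minimal userspace sketch of the
lookup pattern. It is not driver code: every sketch_* name, the station
table, and the last_tx_rate_kbps field are hypothetical stand-ins, and
the RCU read-side section is modelled with a plain counter. In the
kernel, the lookup would be ieee80211_find_sta_by_ifaddr(), which also
takes the ieee80211_hw and a local address that this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for struct ieee80211_sta. */
struct sketch_sta {
	uint8_t addr[ETH_ALEN];
	uint32_t last_tx_rate_kbps;	/* hypothetical per-station field */
};

/* Hypothetical station table; the real one is owned by mac80211. */
struct sketch_sta sketch_sta_table[2] = {
	{ { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 }, 0 },
	{ { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 }, 0 },
};

/* Models the RCU read-side nesting depth with a plain counter. */
static int sketch_rcu_depth;

static void sketch_rcu_read_lock(void)
{
	sketch_rcu_depth++;
}

static void sketch_rcu_read_unlock(void)
{
	assert(sketch_rcu_depth > 0);
	sketch_rcu_depth--;
}

/*
 * Stand-in for the station lookup by address. It refuses to return a
 * station outside a read-side section, to model the locking rule.
 */
static struct sketch_sta *sketch_find_sta_by_addr(const uint8_t *addr)
{
	size_t i;

	if (sketch_rcu_depth <= 0)
		return NULL;

	for (i = 0; i < sizeof(sketch_sta_table) / sizeof(sketch_sta_table[0]); i++) {
		if (memcmp(sketch_sta_table[i].addr, addr, ETH_ALEN) == 0)
			return &sketch_sta_table[i];
	}
	return NULL;
}

/* Rate update: looks up the station from addr1, no peer lock involved. */
int sketch_update_txcompl(const uint8_t *addr1, uint32_t rate_kbps)
{
	struct sketch_sta *sta = sketch_find_sta_by_addr(addr1);

	if (!sta)
		return -1;

	sta->last_tx_rate_kbps = rate_kbps;
	return 0;
}

/* Completion handler: already inside the read-side section. */
int sketch_tx_complete_msdu(const uint8_t *addr1, uint32_t rate_kbps)
{
	int ret;

	sketch_rcu_read_lock();
	ret = sketch_update_txcompl(addr1, rate_kbps);
	sketch_rcu_read_unlock();
	return ret;
}
```

The point of the sketch is only that the rate update relies on the
read-side section its caller already holds, so the per-completion path
does not have to take a separate spinlock.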

I can rebase on the latest code and post v5 :)

Best regards,
Lingbo Kong