On 2024/6/17 19:50, Lingbo Kong wrote:
On 2024/6/5 14:31, Lingbo Kong wrote:
On 2024/4/26 19:21, Kalle Valo wrote:
Lingbo Kong <quic_lingbok@xxxxxxxxxxx> writes:
On 2024/4/26 0:54, Kalle Valo wrote:
Lingbo Kong <quic_lingbok@xxxxxxxxxxx> writes:
+static void ath12k_dp_tx_update_txcompl(struct ath12k *ar,
+ struct hal_tx_status *ts)
+{
+ struct ath12k_base *ab = ar->ab;
+ struct ath12k_peer *peer;
+ struct ath12k_sta *arsta;
+ struct ieee80211_sta *sta;
+ u16 rate;
+ u8 rate_idx = 0;
+ int ret;
+
+ spin_lock_bh(&ab->base_lock);
Did you analyse how this function, and especially taking the base_lock,
affects performance?
The base_lock is taken here because ath12k_peer_find_by_id() has to
look up the peer from ts->peer_id, and I think taking the lock might
affect performance.
Do I need to run a throughput test?
Ok, so to answer my question: no, you didn't do any performance
analysis. A throughput test might not be enough; for example, the driver
can be used on slower systems, and running the test on a fast CPU might
not reveal any problem. A proper analysis would be much better.
Hi Kalle,
I did a simple performance analysis of the
ath12k_dp_tx_update_txcompl() function on slower systems.
First, I used the perf tool to set dynamic tracepoints in the
ath12k_dp_tx_complete_msdu() function, and then ran
"iperf -c ip address -w 4M -n 1G -i 1" to generate traffic.
While the test was running, I used ./perf record -a -g to profile the
system.
Finally, I compared the results with and without this patch.
Without this patch

./perf report output:
  children  self   command      symbol
  7.28%     0.08%  ksoftirqd/0  ath12k_dp_tx_complete_msdu
  5.96%     0.03%  swapper      ath12k_dp_tx_complete_msdu

iperf output:
  [ 1] 0.0000-62.6712 sec 1.00 GBytes 137 Mbits/sec

With this patch

./perf report output:
  children  self   command      symbol
  7.42%     0.08%  ksoftirqd/0  ath12k_dp_tx_complete_msdu
  6.32%     0.03%  swapper      ath12k_dp_tx_complete_msdu

iperf output:
  [ 1] 0.0000-62.6732 sec 1.00 GBytes 137 Mbits/sec
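As the results above show, with this patch the CPU time spent in
ath12k_dp_tx_complete_msdu() increases by about 0.5 percentage points in
total (7.28% -> 7.42% for ksoftirqd/0 and 5.96% -> 6.32% for swapper),
while the iperf throughput stays at 137 Mbits/sec.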
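To spell out the arithmetic behind the 0.5% figure: it is the sum of the
two per-command increases in the "children" column. Below is a small
standalone C sketch of that calculation. The values are copied from the
perf output above and stored as hundredths of a percent so the math stays
in integers; the struct and function names are only for illustration and
are not part of the patch.

```c
#include <stddef.h>

/*
 * "children" share of ath12k_dp_tx_complete_msdu, in hundredths of a
 * percent, copied from the perf report output in this mail.
 * The names here are illustrative only.
 */
struct perf_sample {
	const char *command;
	int without_patch;
	int with_patch;
};

static const struct perf_sample samples[] = {
	{ "ksoftirqd/0", 728, 742 },
	{ "swapper",     596, 632 },
};

/* Sum of the per-command increases, in hundredths of a percentage point. */
static int total_increase(void)
{
	int total = 0;
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		total += samples[i].with_patch - samples[i].without_patch;

	return total;
}
```

total_increase() returns 50, i.e. 0.14 + 0.36 = 0.50 percentage points.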
So I think this patch does have an impact on system performance, but
the impact is small and I think it can be ignored :)
Best regards
Lingbo Kong
Hi Kalle,
Do you have any comments on the above? :)
Best regards
Lingbo Kong
Hi Kalle,
In this patch, ath12k takes base_lock because it needs to call
ath12k_peer_find_by_id() to find the peer from peer_id, and then reach
the ieee80211_sta through that peer. The base_lock protects data such
as the peers.
I've thought of an alternative approach that avoids base_lock: we could
use ieee80211_find_sta_by_ifaddr() to look up the ieee80211_sta directly
from hdr->addr1, which would potentially remove the need for base_lock.
Note that ieee80211_find_sta_by_ifaddr() must be called under the RCU
read lock. Fortunately, ath12k_dp_tx_complete_msdu() already takes
rcu_read_lock().
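Roughly, the call structure I have in mind looks like the sketch below.
This is a userspace model so that it builds outside the kernel, not the
actual v5 code: everything with a demo_ prefix is a hypothetical
stand-in (demo_find_sta_by_addr() for ieee80211_find_sta_by_ifaddr(),
demo_rcu_read_lock()/demo_rcu_read_unlock() for the RCU read-side
primitives), and the station table and txrate field are made up for the
example.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_ETH_ALEN 6

/* Hypothetical stand-in for struct ieee80211_sta. */
struct demo_sta {
	unsigned char addr[DEMO_ETH_ALEN];
	unsigned int txrate;	/* last reported TX rate, arbitrary units */
};

/* Made-up station table standing in for mac80211's station lookup. */
static struct demo_sta demo_stations[] = {
	{ { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 }, 0 },
	{ { 0x02, 0x00, 0x00, 0x00, 0x00, 0x02 }, 0 },
};

/* Read-side nesting counter; models rcu_read_lock()/rcu_read_unlock(). */
static int demo_rcu_depth;

static void demo_rcu_read_lock(void)   { demo_rcu_depth++; }
static void demo_rcu_read_unlock(void) { demo_rcu_depth--; }

/*
 * Stand-in for the station lookup by address. Unlike the real mac80211
 * function, this model returns NULL when called outside a read-side
 * section, purely to make the "must be called under RCU" rule visible.
 */
static struct demo_sta *demo_find_sta_by_addr(const unsigned char *addr)
{
	size_t i;

	if (demo_rcu_depth <= 0)
		return NULL;

	for (i = 0; i < sizeof(demo_stations) / sizeof(demo_stations[0]); i++)
		if (!memcmp(demo_stations[i].addr, addr, DEMO_ETH_ALEN))
			return &demo_stations[i];

	return NULL;
}

/*
 * Models the proposed ath12k_dp_tx_update_txcompl(): no base_lock is
 * taken, the caller is expected to hold the read lock already.
 * Returns 0 on success, -1 if no station matches addr1.
 */
static int demo_update_txcompl(const unsigned char *addr1, unsigned int rate)
{
	struct demo_sta *sta = demo_find_sta_by_addr(addr1);

	if (!sta)
		return -1;

	sta->txrate = rate;
	return 0;
}

/* Models ath12k_dp_tx_complete_msdu(), which already holds the read lock. */
static int demo_tx_complete_msdu(const unsigned char *addr1, unsigned int rate)
{
	int ret;

	demo_rcu_read_lock();
	ret = demo_update_txcompl(addr1, rate);
	demo_rcu_read_unlock();

	return ret;
}
```

The point is only that the lookup and the use of the returned station
both stay inside the read-side section that the caller already opens,
so no extra lock is needed on the TX completion path.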
I can rebase on the latest code and post v5:)
Best regards
Lingbo Kong