Re: Re: [kvm-unit-tests 1/1] arm64: microbench: Move the read of the count register and the ISB operation out of the while loop

Hi,

Thank you for your suggestions. I have incorporated them into the new patch, which now includes the "dsb" instruction.



-----Original Message-----
From: "Alexandru Elisei" <alexandru.elisei@xxxxxxx>
Sent: 2023-11-01 19:04:23
To: "何琼" <heqiong1557@xxxxxxxxxxxxxx>
Cc: "kvm" <kvm@xxxxxxxxxxxxxxx>
Subject: Re: [kvm-unit-tests 1/1] arm64: microbench: Move the read of the count register and the ISB operation out of the while loop


Hi,

Comments on the patch itself.

On Wed, Nov 01, 2023 at 04:25:39PM +0800, 何琼 wrote:
> hi,
> 
> This patch mainly includes the following content.
> 
> Reducing the impact of the cntvct_el0 register and isb() operation on microbenchmark test results to improve testing accuracy and reduce latency in test results.
> 
> Test in kunpeng920,
> 
> Test results before applying the patch:
> 
> [root@localhost tests]# ./micro-bench
> BUILD_HEAD=767629ca
> Test marked not to be run by default, are you sure (y/N)? y
> timeout -k 1s --foreground 90s numactl -C 0-3 -m 0 /usr/libexec/qemu-kvm -nodefaults -machine virt,gic-version=host -accel kvm -cpu host -device virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none -serial stdio -kernel /tmp/tmp.y4c4YHIprP -smp 2 # -initrd /tmp/tmp.KLLmjTuq2d
> Timer Frequency 100000000 Hz (Output in microseconds)
> 
> name                        total ns        avg ns
> --------------------------------------------------
> hvc                       26774980.0         408.0
> mmio_read_user           151183350.0        2306.0
> mmio_read_vgic            41849830.0         638.0
> eoi                        1735610.0          26.0
> ipi                      111260770.0        1697.0
> ipi_hw test skipped
> lpi                      142124570.0        2168.0
> timer_10ms                  466660.0        1822.0
> 
> EXIT: STATUS=1
> 
> PASS micro-bench
> [root@localhost tests]#
> 
> Test results after applying the patch:
> 
> [root@localhost kvm-unit-tests]# cd tests/
> [root@localhost tests]# ./micro-bench
> BUILD_HEAD=767629ca
> Test marked not to be run by default, are you sure (y/N)? y
> timeout -k 1s --foreground 90s numactl -C 0-3 -m 0 /usr/libexec/qemu-kvm -nodefaults -machine virt,gic-version=host -accel kvm -cpu host -device virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none -serial stdio -kernel /tmp/tmp.FiBID6KLxB -smp 2 # -initrd /tmp/tmp.oSKZeugleF
> Timer Frequency 100000000 Hz (Output in microseconds)
> 
> name                        total ns        avg ns
> --------------------------------------------------
> hvc                       26721040.0         407.0
> mmio_read_user           150824560.0        2301.0
> mmio_read_vgic            41845380.0         638.0
> eoi                        1109180.0          16.0
> ipi                      106062150.0        1618.0
> ipi_hw test skipped
> lpi                      141700760.0        2162.0
> timer_10ms                  470870.0        1839.0
> 
> EXIT: STATUS=1
> 
> PASS micro-bench
> [root@localhost tests]#
> 
> Test in phytium S2500,
> 
> Test results before applying the patch:
> 
> [root@primecontroller tests]# ./micro-bench
> BUILD_HEAD=518cd47c
> Test marked not to be run by default, are you sure (y/N)? y
> timeout -k 1s --foreground 90s numactl -C 0-3 -m 0 /usr/local/bin/qemu-system-aarch64 -nodefaults -machine virt,gic-version=host -accel kvm -cpu host -device virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none -serial stdio -kernel /tmp/tmp.lrJJqSuLmN -smp 2 # -initrd /tmp/tmp.s18C3k2jfO
> Timer Frequency 50000000 Hz (Output in microseconds)
> 
> name                        total ns        avg ns
> --------------------------------------------------
> hvc                      100668780.0        1536.0
> mmio_read_user           472806800.0        7214.0
> mmio_read_vgic           140912320.0        2150.0
> eoi                        2972280.0          45.0
> ipi                      326332780.0        4979.0
> ipi_hw test skipped
> lpi                      359226600.0        5481.0
> timer_10ms                 1271960.0        4968.0
> 
> EXIT: STATUS=1
> 
> PASS micro-bench
> [root@primecontroller tests]#
> 
> Test results after applying the patch:
> 
> [root@primecontroller tests]# ./micro-bench
> BUILD_HEAD=518cd47c
> Test marked not to be run by default, are you sure (y/N)? y
> timeout -k 1s --foreground 90s numactl -C 0-3 -m 0 /usr/local/bin/qemu-system-aarch64 -nodefaults -machine virt,gic-version=host -accel kvm -cpu host -device virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none -serial stdio -kernel /tmp/tmp.IsEtcs1W1g -smp 2 # -initrd /tmp/tmp.885IpeoGw4
> Timer Frequency 50000000 Hz (Output in microseconds)
> 
> name                        total ns        avg ns
> --------------------------------------------------
> hvc                       99490080.0        1518.0
> mmio_read_user           474781300.0        7244.0
> mmio_read_vgic           140470760.0        2143.0
> eoi                        1693260.0          25.0
> ipi                      323551200.0        4936.0
> ipi_hw test skipped
> lpi                      355690620.0        5427.0
> timer_10ms                 1318540.0        5150.0
> 
> EXIT: STATUS=1
> 
> PASS micro-bench
> [root@primecontroller tests]#
> 
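
For reading the tables above: the totals are multiples of 10 ns on the
100 MHz machine and of 20 ns on the 50 MHz machine, which is what one counter
tick is worth at those frequencies. A minimal sketch of that arithmetic
follows. The names ticks_to_ns() and avg_ns() are made up for this example
(the helper in arm/micro-bench.c is ticks_to_ns_time() and may be written
differently), and the iteration count of 65536 used in the usage note is an
assumption that happens to match the hvc rows, not something stated in the
output.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: convert counter ticks to nanoseconds for a counter
 * running at freq_hz. Whole seconds and the remainder are converted
 * separately so the multiplication does not overflow for large tick counts.
 */
static inline uint64_t ticks_to_ns(uint64_t ticks, uint64_t freq_hz)
{
	assert(freq_hz > 0);
	return (ticks / freq_hz) * NSEC_PER_SEC +
	       (ticks % freq_hz) * NSEC_PER_SEC / freq_hz;
}

/* Hypothetical helper: average time per iteration, truncated to whole ns. */
static inline uint64_t avg_ns(uint64_t total_ns, uint64_t ntimes)
{
	assert(ntimes > 0);
	return total_ns / ntimes;
}
```

With these definitions, ticks_to_ns(1, 100000000) is 10 and
ticks_to_ns(1, 50000000) is 20, and avg_ns(26774980, 65536) gives 408, the
value in the first hvc row.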

> From 518cd47c33fce60ef86ed66dfa9e904b66499933 Mon Sep 17 00:00:00 2001
> From: heqiong <heqiong1557@xxxxxxxxxxxxxx>
> Date: Wed, 1 Nov 2023 15:06:28 +0800
> Subject: [kvm-unit-tests 1/1] arm64: microbench: Move the read of the count
>  register and the ISB operation out of the while loop
> 
> Reducing the impact of the cntvct_el0 register and isb() operation
> on microbenchmark test results to improve testing accuracy and reduce
> latency in test results.
> ---
>  arm/micro-bench.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/arm/micro-bench.c b/arm/micro-bench.c
> index fbe59d03..ee5b9ca0 100644
> --- a/arm/micro-bench.c
> +++ b/arm/micro-bench.c
> @@ -346,17 +346,18 @@ static void loop_test(struct exit_test *test)
>  		}
>  	}
>  
> +	start = read_sysreg(cntpct_el0);
> +	isb();
>  	while (ntimes < test->times && total_ns.ns < NS_5_SECONDS) {
> -		isb();
> -		start = read_sysreg(cntvct_el0);
>  		test->exec();
> -		isb();
> -		end = read_sysreg(cntvct_el0);
>  
>  		ntimes++;
> -		total_ticks += (end - start);
> -		ticks_to_ns_time(total_ticks, &total_ns);
>  	}
> +	isb();
> +	end = read_sysreg(cntpct_el0);
> +
> +	total_ticks = end - start;
> +	ticks_to_ns_time(total_ticks, &total_ns);

A few notes:

* The counter that is being used has been changed from the virtual to the
  physical counter. Accesses to the physical counter trap on nVHE systems.
  That might not be desirable if what you're after is to reduce latency.

* You need an ISB before reading 'start', otherwise the counter read might be
  reordered earlier in program order.

* Memory loads and stores are not ordered by an ISB. If there are memory
  accesses before 'start' is read, you probably want them to be finished before
  the counter is read. Similarly, I don't think there are any restrictions on
  what the test->exec() function is allowed to do, so there might be memory
  accesses as part of the test.

I suggest something like this:

	dsb();	// Wait for loads and stores to complete.
	isb();	// Order the counter read after the DSB.
	start = read_sysreg(cntvct_el0);
	isb();	// Order the counter read before the loop.
	// No DSB needed, as per ARM DDI 0487J.a, page D11-5991.

	/* test loop */

	dsb();	// Wait for loads and stores to complete.
	isb();	// Order the counter read after the DSB.
	end = read_sysreg(cntvct_el0);
	// No DSB or ISB needed, as per ARM DDI 0487J.a, page D11-5991.
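
As a self-contained illustration of the sequence above, here is a minimal
sketch that can be built outside kvm-unit-tests. The helper names
(counter_read_start, counter_read_end, counter_delta) are made up for this
example, the barriers are written as raw instructions instead of the
dsb()/isb() macros, and "dsb sy" is an assumed choice of barrier scope. The
non-arm64 branch falls back to clock_gettime() only so that the file compiles
elsewhere; it gives none of the ordering guarantees discussed here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__aarch64__)
/* Hypothetical helper: counter read that opens a measured region. */
static inline uint64_t counter_read_start(void)
{
	uint64_t val;

	__asm__ __volatile__(
		"dsb sy\n\t"		/* wait for earlier loads and stores */
		"isb\n\t"		/* order the counter read after the DSB */
		"mrs %0, cntvct_el0\n\t"
		"isb"			/* order the counter read before the loop */
		: "=r" (val) : : "memory");
	return val;
}

/* Hypothetical helper: counter read that closes a measured region. */
static inline uint64_t counter_read_end(void)
{
	uint64_t val;

	__asm__ __volatile__(
		"dsb sy\n\t"		/* wait for the loop's loads and stores */
		"isb\n\t"		/* order the counter read after the DSB */
		"mrs %0, cntvct_el0"
		: "=r" (val) : : "memory");
	return val;
}
#else
/* Portable stand-in so the sketch builds on other hosts (no barriers). */
static inline uint64_t fallback_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

static inline uint64_t counter_read_start(void) { return fallback_ns(); }
static inline uint64_t counter_read_end(void) { return fallback_ns(); }
#endif

/* Elapsed ticks between the two reads; the counter is expected to be monotonic. */
static inline uint64_t counter_delta(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return end - start;
}
```

The loop under test would sit between a call to counter_read_start() and a
call to counter_read_end(), with counter_delta() giving the elapsed ticks.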

Thanks,
Alex

>
>  	if (test->post) {
>  		test->post(ntimes, &total_ticks);
> -- 
> 2.31.1
> 


From c803c6ac05dfcbc87f26697a459a6aa60b010030 Mon Sep 17 00:00:00 2001
From: heqiong <heqiong1557@xxxxxxxxxxxxxx>
Date: Thu, 2 Nov 2023 14:15:54 +0800
Subject: [kvm-unit-tests 1/1] arm64: microbench: Move the read of the count
 register and the ISB operation out of the while loop

Reducing the impact of the cntvct_el0 register and isb() operation
on microbenchmark test results to improve testing accuracy and reduce
latency in test results.
---
 arm/micro-bench.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arm/micro-bench.c b/arm/micro-bench.c
index fbe59d03..6b940d56 100644
--- a/arm/micro-bench.c
+++ b/arm/micro-bench.c
@@ -346,17 +346,21 @@ static void loop_test(struct exit_test *test)
 		}
 	}
 
+	dsb(ish);
+	isb();
+	start = read_sysreg(cntpct_el0);
+	isb();
 	while (ntimes < test->times && total_ns.ns < NS_5_SECONDS) {
-		isb();
-		start = read_sysreg(cntvct_el0);
 		test->exec();
-		isb();
-		end = read_sysreg(cntvct_el0);
 
 		ntimes++;
-		total_ticks += (end - start);
-		ticks_to_ns_time(total_ticks, &total_ns);
 	}
+	dsb(ish);
+	isb();
+	end = read_sysreg(cntpct_el0);
+
+	total_ticks = end - start;
+	ticks_to_ns_time(total_ticks, &total_ns);
 
 	if (test->post) {
 		test->post(ntimes, &total_ticks);
-- 
2.31.1
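
One thing that may be worth checking in the hunk above: ticks_to_ns_time() is
now called only after the loop, so, as far as the hunk shows, the
total_ns.ns < NS_5_SECONDS test in the loop condition no longer sees an
updated value while the loop runs. A possible way to keep a time cap without
reading the counter on every iteration is to sample it every N iterations.
The sketch below models that idea in portable C. Everything in it is an
assumption for illustration: timed_loop(), struct loop_result, nop_exec(),
CHECK_INTERVAL and the clock_gettime() based now_ns() are invented names and
stand-ins, not code from arm/micro-bench.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NS_5_SECONDS	(5ULL * 1000 * 1000 * 1000)
#define CHECK_INTERVAL	1024	/* hypothetical: sample the clock every 1024 runs */

struct loop_result {
	uint64_t ntimes;	/* iterations actually executed */
	uint64_t total_ns;	/* elapsed time for the whole loop */
};

/* Stand-in for the counter read; a real test would use the arch counter. */
static inline uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Stand-in for test->exec(). */
static inline void nop_exec(void)
{
}

/* Time the whole loop once, but still stop after roughly limit_ns. */
static inline struct loop_result timed_loop(void (*exec)(void), uint64_t times,
					    uint64_t limit_ns)
{
	struct loop_result res = { 0, 0 };
	uint64_t start;

	assert(exec != NULL);
	start = now_ns();
	while (res.ntimes < times) {
		exec();
		res.ntimes++;
		/* Only pay for a clock read once per CHECK_INTERVAL iterations. */
		if (res.ntimes % CHECK_INTERVAL == 0 &&
		    now_ns() - start >= limit_ns)
			break;
	}
	res.total_ns = now_ns() - start;
	return res;
}
```

The trade-off is that the cap is only enforced to within CHECK_INTERVAL
iterations, and the periodic clock reads are still included in the measured
total, so the interval would need to be large compared with the cost of one
read.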

