Re: [kvm-unit-tests PATCH 10/10] arm64: gic: Use IPI test checking for the LPI tests

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Alexandru,
On 11/25/20 4:51 PM, Alexandru Elisei wrote:
> The LPI code validates a result similarly to the IPI tests, by checking if
> the target CPU received the interrupt with the expected interrupt number.
> However, the LPI tests invent their own way of checking the test results by
> creating a global struct (lpi_stats), using a separate interrupt handler
> (lpi_handler) and test function (check_lpi_stats).
> 
> There are several areas that can be improved in the LPI code, which are
> already covered by the IPI tests:
> 
> - check_lpi_stats() doesn't take into account that the target CPU can
>   receive the correct interrupt multiple times.
> - check_lpi_stats() doesn't take into the account the scenarios where all
>   online CPUs can receive the interrupt, but the target CPU is the last CPU
>   that touches lpi_stats.observed.
> - Insufficient or missing memory synchronization.
> 
> Instead of duplicating code, let's convert the LPI tests to use
> check_acked() and the same interrupt handler as the IPI tests, which has
> been renamed to irq_handler() to avoid any confusion.
> 
> check_lpi_stats() has been replaced with check_acked() which, together with
> using irq_handler(), instantly gives us more correctness checks and proper
> memory synchronization between threads. lpi_stats.expected has been
> replaced by the CPU mask and the expected interrupt number arguments to
> check_acked(), with no change in semantics.
> 
> lpi_handler() aborted the test if the interrupt number was not an LPI. This
> was changed in favor of allowing the test to continue, as it will fail in
> check_acked(), but possibly print information useful for debugging. If the
> test receives spurious interrupts, those are reported via report_info() at
> the end of the test for consistency with the IPI tests, which don't treat
> spurious interrupts as critical errors.
> 
> In the spirit of code reuse, secondary_lpi_tests() has been replaced with
> ipi_recv() because the two are now identical; ipi_recv() has been renamed
> to irq_recv(), similarly to irq_handler(), to avoid confusion.
> 
> CC: Eric Auger <eric.auger@xxxxxxxxxx>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@xxxxxxx>
> ---
> With this change, I get the following failure for its-trigger on a
> rockpro64 (running on the little cores):
> 
> $ taskset -c 0-3 arm/run arm/gic.flat -smp 4 -machine gic-version=3 -append its-trigger
> /usr/bin/qemu-system-aarch64 -nodefaults -machine virt,gic-version=host,accel=kvm -cpu host -device virtio-serial-device -device virtconsole,chardev=ctd -chardev testdev,id=ctd -device pci-testdev -display none -serial stdio -kernel arm/gic.flat -smp 4 -machine gic-version=3 -append its-trigger # -initrd /tmp/tmp.wWW0iJY6DS
> ITS: MAPD devid=2 size = 0x8 itt=0x403a0000 valid=1
> ITS: MAPD devid=7 size = 0x8 itt=0x403b0000 valid=1
> MAPC col_id=3 target_addr = 0x30000 valid=1
> MAPC col_id=2 target_addr = 0x20000 valid=1
> INVALL col_id=2
> INVALL col_id=3
> MAPTI dev_id=2 event_id=20 -> phys_id=8195, col_id=3
> MAPTI dev_id=7 event_id=255 -> phys_id=8196, col_id=2
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: int: dev=2, eventid=20  -> lpi= 8195, col=3
> INT dev_id=7 event_id=255
> PASS: gicv3: its-trigger: int: dev=7, eventid=255 -> lpi= 8196, col=2
> INV dev_id=2 event_id=20
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 does not trigger any LPI
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: inv/invall: dev2/eventid=20 still does not trigger any LPI
> INVALL col_id=3
> INT dev_id=2 event_id=20
> INFO: gicv3: its-trigger: inv/invall: ACKS: missing=0 extra=1 unexpected=0
> FAIL: gicv3: its-trigger: inv/invall: dev2/eventid=20 now triggers an LPI
> ITS: MAPD devid=2 size = 0x8 itt=0x403a0000 valid=0
> INT dev_id=2 event_id=20
> PASS: gicv3: its-trigger: mapd valid=false: no LPI after device unmap
> SUMMARY: 6 tests, 1 unexpected failures
> 
> The reason for the failure is that the test "dev2/eventid=20 now triggers
> an LPI" triggers 2 LPIs, not one. This behavior was present before this
> patch, but it was ignored because check_lpi_stats() wasn't looking at the
> acked array.
> 
> I'm not familiar with the ITS so I'm not sure if this is expected, if the
> test is incorrect or if there is something wrong with KVM emulation.
> 
> Did some more testing on an Ampere eMAG (fast out-of-order cores) using
> qemu and kvmtool and Linux v5.8, here's what I found:
> 
> - Using qemu and gic.flat built from *master*: error encountered 864 times
>   out of 1088 runs.
> - Using qemu: error encountered 852 times out of 1027 runs.
> - Using kvmtool: error encountered 8164 times out of 10602 runs.
> 
> Looks to me like it's consistent between master and this series, and
> between qemu and kvmtool.
> 
> Here's the diff that I used for testing master (I removed the diff line
> because it causes trouble when applying the main patch):
> 
> @@ -772,8 +772,12 @@ static void test_its_trigger(void)
>         /* Now call the invall and check the LPI hits */
>         its_send_invall(col3);
>         lpi_stats_expect(3, 8195);
> +       acked[3] = 0;
> +       dsb(ishst);
>         its_send_int(dev2, 20);
>         check_lpi_stats("dev2/eventid=20 now triggers an LPI");
> +       report_info("acked[3] = %d", acked[3]);
> +       report(acked[3] == 1, "dev2/eventid=20 received one interrupt");
>  
>         report_prefix_pop();
>  
> 
>  arm/gic.c | 185 ++++++++++++++++++++++++++----------------------------
>  1 file changed, 88 insertions(+), 97 deletions(-)
> 
> diff --git a/arm/gic.c b/arm/gic.c
> index da7b42da5449..6e93da80fe0d 100644
> --- a/arm/gic.c
> +++ b/arm/gic.c
> @@ -111,7 +111,7 @@ static bool check_acked(cpumask_t *mask, int sender, int irqnum)
>  		}
>  		if (!acked[cpu])
>  			continue;
> -		smp_rmb(); /* pairs with smp_wmb in ipi_handler */
> +		smp_rmb(); /* pairs with smp_wmb in irq_handler */
>  
>  		if (has_gicv2 && irq_sender[cpu] != sender) {
>  			report_info("cpu%d received IPI from wrong sender %d",
> @@ -149,11 +149,12 @@ static void check_spurious(void)
>  static int gic_get_sender(int irqstat)
>  {
>  	if (gic_version() == 2)
> +		/* GICC_IAR.CPUID is RAZ for non-SGIs */
>  		return (irqstat >> 10) & 7;
>  	return -1;
>  }
>  
> -static void ipi_handler(struct pt_regs *regs __unused)
> +static void irq_handler(struct pt_regs *regs __unused)
>  {
>  	u32 irqstat = gic_read_iar();
>  	u32 irqnr = gic_iar_irqnr(irqstat);
> @@ -192,75 +193,6 @@ static void setup_irq(irq_handler_fn handler)
>  }
>  
>  #if defined(__aarch64__)
> -struct its_event {
> -	int cpu_id;
> -	int lpi_id;
> -};
> -
> -struct its_stats {
> -	struct its_event expected;
> -	struct its_event observed;
> -};
> -
> -static struct its_stats lpi_stats;
> -
> -static void lpi_handler(struct pt_regs *regs __unused)
> -{
> -	u32 irqstat = gic_read_iar();
> -	int irqnr = gic_iar_irqnr(irqstat);
> -
> -	gic_write_eoir(irqstat);
> -	assert(irqnr >= 8192);
> -	smp_rmb(); /* pairs with wmb in lpi_stats_expect */
> -	lpi_stats.observed.cpu_id = smp_processor_id();
> -	lpi_stats.observed.lpi_id = irqnr;
> -	acked[lpi_stats.observed.cpu_id]++;
> -	smp_wmb(); /* pairs with rmb in check_lpi_stats */
> -}
> -
> -static void lpi_stats_expect(int exp_cpu_id, int exp_lpi_id)
> -{
> -	lpi_stats.expected.cpu_id = exp_cpu_id;
> -	lpi_stats.expected.lpi_id = exp_lpi_id;
> -	lpi_stats.observed.cpu_id = -1;
> -	lpi_stats.observed.lpi_id = -1;
> -	smp_wmb(); /* pairs with rmb in handler */
> -}
> -
> -static void check_lpi_stats(const char *msg)
> -{
> -	int i;
> -
> -	for (i = 0; i < 50; i++) {
> -		mdelay(100);
> -		smp_rmb(); /* pairs with wmb in lpi_handler */
> -		if (lpi_stats.observed.cpu_id == lpi_stats.expected.cpu_id &&
> -		    lpi_stats.observed.lpi_id == lpi_stats.expected.lpi_id) {
> -			report(true, "%s", msg);
> -			return;
> -		}
> -	}
> -
> -	if (lpi_stats.observed.cpu_id == -1 && lpi_stats.observed.lpi_id == -1) {
> -		report_info("No LPI received whereas (cpuid=%d, intid=%d) "
> -			    "was expected", lpi_stats.expected.cpu_id,
> -			    lpi_stats.expected.lpi_id);
> -	} else {
> -		report_info("Unexpected LPI (cpuid=%d, intid=%d)",
> -			    lpi_stats.observed.cpu_id,
> -			    lpi_stats.observed.lpi_id);
> -	}
> -	report(false, "%s", msg);
> -}
> -
> -static void secondary_lpi_test(void)
> -{
> -	setup_irq(lpi_handler);
> -	cpumask_set_cpu(smp_processor_id(), &ready);
> -	while (1)
> -		wfi();
> -}
> -
>  static void check_lpi_hits(int *expected, const char *msg)
>  {
>  	bool pass = true;
> @@ -347,7 +279,7 @@ static void ipi_test_smp(void)
>  
>  static void ipi_send(void)
>  {
> -	setup_irq(ipi_handler);
> +	setup_irq(irq_handler);
>  	wait_on_ready();
>  	ipi_test_self();
>  	ipi_test_smp();
> @@ -355,9 +287,9 @@ static void ipi_send(void)
>  	exit(report_summary());
>  }
>  
> -static void ipi_recv(void)
> +static void irq_recv(void)
>  {
> -	setup_irq(ipi_handler);
> +	setup_irq(irq_handler);
>  	cpumask_set_cpu(smp_processor_id(), &ready);
>  	while (1)
>  		wfi();
> @@ -368,7 +300,7 @@ static void ipi_test(void *data __unused)
>  	if (smp_processor_id() == IPI_SENDER)
>  		ipi_send();
>  	else
> -		ipi_recv();
> +		irq_recv();
>  }
>  
>  static struct gic gicv2 = {
> @@ -698,12 +630,12 @@ static int its_prerequisites(int nb_cpus)
>  
>  	stats_reset();
>  
> -	setup_irq(lpi_handler);
> +	setup_irq(irq_handler);
>  
>  	for_each_present_cpu(cpu) {
>  		if (cpu == 0)
>  			continue;
> -		smp_boot_secondary(cpu, secondary_lpi_test);
> +		smp_boot_secondary(cpu, irq_recv);
>  	}
>  	wait_on_ready();
>  
> @@ -757,6 +689,7 @@ static void test_its_trigger(void)
>  {
>  	struct its_collection *col3;
>  	struct its_device *dev2, *dev7;
> +	cpumask_t mask;
>  
>  	if (its_setup1())
>  		return;
> @@ -767,13 +700,27 @@ static void test_its_trigger(void)
>  
>  	report_prefix_push("int");
>  
> -	lpi_stats_expect(3, 8195);
> +	stats_reset();
> +	/*
> +	 * its_send_int() is missing the synchronization from the GICv3 IPI
> +	 * trigger functions.
> +	 */
> +	wmb();
so don't you want to add it in __its_send_int instead?

Eric
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(3, &mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev=2, eventid=20  -> lpi= 8195, col=3");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8195),
> +			"dev=2, eventid=20  -> lpi= 8195, col=3");
>  
> -	lpi_stats_expect(2, 8196);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(2, &mask);
>  	its_send_int(dev7, 255);
> -	check_lpi_stats("dev=7, eventid=255 -> lpi= 8196, col=2");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8196),
> +			"dev=7, eventid=255 -> lpi= 8196, col=2");
>  
>  	report_prefix_pop();
>  
> @@ -786,9 +733,13 @@ static void test_its_trigger(void)
>  	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT & ~LPI_PROP_ENABLED);
>  	its_send_inv(dev2, 20);
>  
> -	lpi_stats_expect(-1, -1);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 does not trigger any LPI");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, -1, -1),
> +			"dev2/eventid=20 does not trigger any LPI");
>  
>  	/*
>  	 * re-enable the LPI but willingly do not call invall
> @@ -796,15 +747,24 @@ static void test_its_trigger(void)
>  	 * The LPI should not hit
>  	 */
>  	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
> -	lpi_stats_expect(-1, -1);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 still does not trigger any LPI");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, -1, -1),
> +			"dev2/eventid=20 still does not trigger any LPI");
>  
>  	/* Now call the invall and check the LPI hits */
>  	its_send_invall(col3);
> -	lpi_stats_expect(3, 8195);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(3, &mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 now triggers an LPI");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8195),
> +			"dev2/eventid=20 now triggers an LPI");
>  
>  	report_prefix_pop();
>  
> @@ -815,9 +775,14 @@ static void test_its_trigger(void)
>  	 */
>  
>  	its_send_mapd(dev2, false);
> -	lpi_stats_expect(-1, -1);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("no LPI after device unmap");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, -1, -1), "no LPI after device unmap");
> +
> +	check_spurious();
>  	report_prefix_pop();
>  }
>  
> @@ -825,6 +790,7 @@ static void test_its_migration(void)
>  {
>  	struct its_device *dev2, *dev7;
>  	bool test_skipped = false;
> +	cpumask_t mask;
>  
>  	if (its_setup1()) {
>  		test_skipped = true;
> @@ -841,13 +807,25 @@ do_migrate:
>  	if (test_skipped)
>  		return;
>  
> -	lpi_stats_expect(3, 8195);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(3, &mask);
>  	its_send_int(dev2, 20);
> -	check_lpi_stats("dev2/eventid=20 triggers LPI 8195 on PE #3 after migration");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8195),
> +			"dev2/eventid=20 triggers LPI 8195 on PE #3 after migration");
>  
> -	lpi_stats_expect(2, 8196);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(2, &mask);
>  	its_send_int(dev7, 255);
> -	check_lpi_stats("dev7/eventid=255 triggers LPI 8196 on PE #2 after migration");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8196),
> +			"dev7/eventid=255 triggers LPI 8196 on PE #2 after migration");
> +
> +	check_spurious();
>  }
>  
>  #define ERRATA_UNMAPPED_COLLECTIONS "ERRATA_8c58be34494b"
> @@ -857,6 +835,7 @@ static void test_migrate_unmapped_collection(void)
>  	struct its_collection *col = NULL;
>  	struct its_device *dev2 = NULL, *dev7 = NULL;
>  	bool test_skipped = false;
> +	cpumask_t mask;
>  	int pe0 = 0;
>  	u8 config;
>  
> @@ -891,17 +870,29 @@ do_migrate:
>  	its_send_mapc(col, true);
>  	its_send_invall(col);
>  
> -	lpi_stats_expect(2, 8196);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(2, &mask);
>  	its_send_int(dev7, 255);
> -	check_lpi_stats("dev7/eventid= 255 triggered LPI 8196 on PE #2");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8196),
> +			"dev7/eventid= 255 triggered LPI 8196 on PE #2");
>  
>  	config = gicv3_lpi_get_config(8192);
>  	report(config == LPI_PROP_DEFAULT,
>  	       "Config of LPI 8192 was properly migrated");
>  
> -	lpi_stats_expect(pe0, 8192);
> +	stats_reset();
> +	wmb();
> +	cpumask_clear(&mask);
> +	cpumask_set_cpu(pe0, &mask);
>  	its_send_int(dev2, 0);
> -	check_lpi_stats("dev2/eventid = 0 triggered LPI 8192 on PE0");
> +	wait_for_interrupts(&mask);
> +	report(check_acked(&mask, 0, 8192),
> +			"dev2/eventid = 0 triggered LPI 8192 on PE0");
> +
> +	check_spurious();
>  }
>  
>  static void test_its_pending_migration(void)
> 




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux