Re: [PATCH V2] KVM: selftests: Take large C-state exit latency into consideration

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 12, 2024, Zide Chen wrote:
> Currently, the migration worker delays 1-10 us, assuming that one
> KVM_RUN iteration only takes a few microseconds.  But if C-state exit
> latencies are large enough, for example, hundreds or even thousands
> of microseconds on server CPUs, it may happen that it's not able to
> bring the target CPU out of C-state before the migration worker starts
> to migrate it to the next CPU.
> 
> If the system workload is light, most CPUs could be at a certain level
> of C-state, which may result in less successful migrations and fail the
> migration/KVM_RUN ratio sanity check.
> 
> This patch adds a command line option to skip the sanity check in
> this case.
> 
> Additionally, seems it's reasonable to randomize the length of usleep(),
> other than delay in a fixed pattern.

This belongs in a separate patch.  And while it's reasonable on the surface, I
doubt think it buys us anything, and it makes an already non-deterministic test
even less deterministic.  In other words, unless a random sleep time helps find
more bugs or finds the original bug faster, just drop the randomization.

> V2:
> - removed the busy loop implementation
> - add the new "-s" option

This belongs in the ignored part of the patch...
> 
> Signed-off-by: Zide Chen <zide.chen@xxxxxxxxx>

...down here.

> ---
>  tools/testing/selftests/kvm/rseq_test.c | 37 +++++++++++++++++++++++--
>  1 file changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/testing/selftests/kvm/rseq_test.c b/tools/testing/selftests/kvm/rseq_test.c
> index 28f97fb52044..515cfa32a925 100644
> --- a/tools/testing/selftests/kvm/rseq_test.c
> +++ b/tools/testing/selftests/kvm/rseq_test.c
> @@ -150,7 +150,7 @@ static void *migration_worker(void *__rseq_tid)
>  		 * Use usleep() for simplicity and to avoid unnecessary kernel
>  		 * dependencies.
>  		 */
> -		usleep((i % 10) + 1);
> +		usleep((rand() % 10) + 1);
>  	}
>  	done = true;
>  	return NULL;
> @@ -186,12 +186,35 @@ static void calc_min_max_cpu(void)
>  		       "Only one usable CPU, task migration not possible");
>  }
>  
> +static void usage(const char *name)

Uber nit, "help()" is more common than "usage()".

> @@ -254,9 +279,15 @@ int main(int argc, char *argv[])
>  	 * getcpu() to stabilize.  A 2:1 migration:KVM_RUN ratio is a fairly
>  	 * conservative ratio on x86-64, which can do _more_ KVM_RUNs than
>  	 * migrations given the 1us+ delay in the migration task.
> +	 *
> +	 * Another reason why it may have small migration:KVM_RUN ratio is that,
> +	 * on systems with large C-state exit latency, it may happen quite often
> +	 * that the scheduler is not able to wake up the target CPU before the
> +	 * vCPU thread is scheduled to another CPU.
>  	 */
> -	TEST_ASSERT(i > (NR_TASK_MIGRATIONS / 2),
> -		    "Only performed %d KVM_RUNs, task stalled too much?", i);
> +	TEST_ASSERT(skip_sanity_check || i > (NR_TASK_MIGRATIONS / 2),
> +		    "Only performed %d KVM_RUNs, task stalled too much? "
> +		    "Try to turn off C-states or run it with the -s option", i);

I think it's worth explicitly telling the user how to reduce CPU wakeup latency.
Also, are C-states called that on other architectures?  E.g. maybe this to avoid
confusing the user?  Not a big deal, e.g. I've no objection whatsoever to the
comment, but it seems easy enough to avoid confusing the user.

		    "Try setting /dev/cpu_dma_latency to reduce CPU wakeup latency, "
		    "or run with -s to skip this sanity check", i);




[Index of Archives]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux