On Wed, Aug 17, 2022, Vipin Sharma wrote:
> Add command line options to run the vcpus and the main process on the
> specific cpus on a host machine. This is useful as it provides
> options to analyze performance based on the vcpus and dirty log worker
> locations, like on the different numa nodes or on the same numa nodes.

The two options should probably be separate patches; they are related but still
two very distinct changes.

> Signed-off-by: Vipin Sharma <vipinsh@xxxxxxxxxx>
> Suggested-by: David Matlack <dmatlack@xxxxxxxxxx>
> Suggested-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> Suggested-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
>
> This is based on the discussion at
> https://lore.kernel.org/lkml/20220801151928.270380-1-vipinsh@xxxxxxxxxx/

This can and should be captured in the changelog proper:

  Link: https://lore.kernel.org/lkml/20220801151928.270380-1-vipinsh@xxxxxxxxxx

> @@ -348,12 +353,74 @@ static void run_test(enum vm_guest_mode mode, void *arg)
>  	perf_test_destroy_vm(vm);
>  }
>
> +static int parse_num(const char *num_str)
> +{
> +	int num;
> +	char *end_ptr;
> +
> +	errno = 0;
> +	num = (int)strtol(num_str, &end_ptr, 10);
> +	TEST_ASSERT(num_str != end_ptr && *end_ptr == '\0',
> +		    "Invalid number string.\n");
> +	TEST_ASSERT(errno == 0, "Conversion error: %d\n", errno);

Is the paranoia truly necessary?  What happens if parse_cpu_list() simply uses
atoi() and is passed garbage?

> +
> +	return num;
> +}
> +
> +static int parse_cpu_list(const char *arg)
> +{
> +	char delim[2] = ",";
> +	char *cpu, *cpu_list;
> +	int i = 0, cpu_num;
> +
> +	cpu_list = strdup(arg);
> +	TEST_ASSERT(cpu_list, "Low memory\n");

Heh, probably a little less than "low".  Just be literal and let the user figure
out why the allocation failed instead.

	TEST_ASSERT(cpu_list, "strdup() allocation failed\n");

> +
> +	cpu = strtok(cpu_list, delim);
> +	while (cpu) {
> +		cpu_num = parse_num(cpu);
> +		TEST_ASSERT(cpu_num >= 0, "Invalid cpu number: %d\n", cpu_num);
> +		vcpu_to_lcpu_map[i++] = cpu_num;
> +		cpu = strtok(NULL, delim);
> +	}
> +
> +	free(cpu_list);

The tokenization and parsing are nearly identical between parse_cpu_list() and
assign_dirty_log_perf_test_cpu().  The code can be made into a common helper by
passing in the destination, e.g.

static int parse_cpu_list(const char *arg, cpu_set_t *cpuset, int *vcpu_map)
{
	const char delim[] = ",";
	char *cpustr, *cpu_list;
	int i = 0, cpu;

	TEST_ASSERT(!!cpuset ^ !!vcpu_map,
		    "Exactly one of cpuset and vcpu_map must be provided\n");

	cpu_list = strdup(arg);
	TEST_ASSERT(cpu_list, "strdup() allocation failed\n");

	cpustr = strtok(cpu_list, delim);
	while (cpustr) {
		cpu = atoi(cpustr);
		TEST_ASSERT(cpu >= 0, "Invalid cpu number: %d\n", cpu);
		if (vcpu_map)
			vcpu_map[i++] = cpu;
		else
			CPU_SET(cpu, cpuset);
		cpustr = strtok(NULL, delim);
	}

	free(cpu_list);

	return i;
}

> @@ -383,6 +450,26 @@ static void help(char *name)
>  	backing_src_help("-s");
>  	printf(" -x: Split the memory region into this number of memslots.\n"
>  	       "     (default: 1)\n");
> +	printf(" -c: Comma separated values of the logical CPUs which will run\n"
> +	       "     the vCPUs. Number of values should be equal to the number\n"
> +	       "     of vCPUs.\n\n"
> +	       "     Example: ./dirty_log_perf_test -v 3 -c 22,43,1\n"
> +	       "     This means that the vcpu 0 will run on the logical cpu 22,\n"
> +	       "     vcpu 1 on the logical cpu 43 and vcpu 2 on the logical cpu 1.\n"
> +	       "     (default: No cpu mapping)\n\n");
> +	printf(" -d: Comma separated values of the logical CPUs on which\n"
> +	       "     dirty_log_perf_test will run. Without -c option, all of\n"
> +	       "     the vcpus and main process will run on the cpus provided here.\n"
> +	       "     This option also accepts a single cpu. (default: No cpu mapping)\n\n"
> +	       "     Example 1: ./dirty_log_perf_test -v 3 -c 22,43,1 -d 101\n"
> +	       "     Main application thread will run on logical cpu 101 and\n"
> +	       "     vcpus will run on the logical cpus 22, 43 and 1\n\n"
> +	       "     Example 2: ./dirty_log_perf_test -v 3 -d 101\n"
> +	       "     Main application thread and vcpus will run on the logical\n"
> +	       "     cpu 101\n\n"
> +	       "     Example 3: ./dirty_log_perf_test -v 3 -d 101,23,53\n"
> +	       "     Main application thread and vcpus will run on logical cpus\n"
> +	       "     101, 23 and 53.\n");
>  	puts("");
>  	exit(0);
>  }
> @@ -455,6 +550,13 @@ int main(int argc, char *argv[])
>  		}
>  	}
>

I wonder if we should make -c and -d mutually exclusive.  Tweak -c to include
the application thread, i.e. TEST_ASSERT(nr_lcpus == nr_vcpus+1) and require
1:1 pinning for all tasks.  E.g. allowing "-c ..., -d 0,1,22" seems unnecessary.

> +	if (nr_lcpus != -1) {
> +		TEST_ASSERT(nr_lcpus == nr_vcpus,
> +			    "Number of vCPUs (%d) are not equal to number of logical cpus provided (%d).",
> +			    nr_vcpus, nr_lcpus);
> +		p.vcpu_to_lcpu = vcpu_to_lcpu_map;
> +	}
> +
>  	TEST_ASSERT(p.iterations >= 2, "The test should have at least two iterations");
>
>  	pr_info("Test iterations: %"PRIu64"\n", p.iterations);
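
For anyone who wants to poke at the common-helper idea outside the selftest
tree, below is a rough standalone sketch.  It is not the code from the patch or
from any later revision, just an illustration under a few assumptions:
TEST_ASSERT() is swapped for assert() and plain return codes, parse_cpu() and
the map_len bound are hypothetical additions, the count is returned for both
destinations, and strtol() is kept (in a terse form) so that garbage is
rejected instead of being read as CPU 0, which is what atoi() would do.

```c
#define _GNU_SOURCE		/* cpu_set_t and the CPU_*() macros */
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse one decimal CPU number, -1 on any error. */
static int parse_cpu(const char *str)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(str, &end, 10);
	/* Reject empty input, trailing junk, overflow and negative values. */
	if (end == str || *end != '\0' || errno || val < 0 || val > INT_MAX)
		return -1;
	return (int)val;
}

/*
 * Parse a comma separated list into exactly one destination: a cpu_set_t
 * (for pinning the main thread) or an int array (vCPU to logical CPU map).
 * map_len is an assumed bound on vcpu_map, ignored when cpuset is used.
 * Returns the number of CPUs parsed, or -1 on malformed input.
 */
static int parse_cpu_list(const char *arg, cpu_set_t *cpuset,
			  int *vcpu_map, int map_len)
{
	char *cpu_list, *cpustr, *save;
	int n = 0, cpu;

	/* Exactly one of the two destinations must be supplied. */
	assert(!!cpuset ^ !!vcpu_map);

	/* strtok_r() modifies its input, so work on a private copy. */
	cpu_list = strdup(arg);
	if (!cpu_list)
		return -1;

	for (cpustr = strtok_r(cpu_list, ",", &save); cpustr;
	     cpustr = strtok_r(NULL, ",", &save)) {
		cpu = parse_cpu(cpustr);
		if (cpu < 0 || cpu >= CPU_SETSIZE ||
		    (vcpu_map && n >= map_len)) {
			n = -1;
			break;
		}
		if (vcpu_map)
			vcpu_map[n] = cpu;
		else
			CPU_SET(cpu, cpuset);
		n++;
	}

	free(cpu_list);
	return n;
}
```

With this sketch, parse_cpu_list("22,43,1", NULL, map, 4) fills map with 22, 43
and 1 and returns 3, whereas "1,x" makes it return -1.  An atoi()-based loop
would have accepted "x" and silently pinned that vCPU to CPU 0.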