On Fri, Jan 20, 2023 at 6:23 AM Ricardo Koller <ricarkol@xxxxxxxxxx> wrote:
>
> On Thu, Jan 19, 2023 at 03:48:14PM -0800, Ben Gardon wrote:
> > On Thu, Jan 19, 2023 at 2:56 PM David Matlack <dmatlack@xxxxxxxxxx> wrote:
> > ...
> > > > +static int NR_VCPUS = 2;
> > > > +static int NR_SLOTS = 2;
> > > > +static int NR_ITERATIONS = 2;
> > > These should be macros or at least const?
> > Yikes, woops, that was a basic mistake.
> >
> > > > +static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
> > > > +
> > > > +/* Host variables */
> > > What does "Host variables" mean? (And why is guest_percpu_mem_size not a
> > > "Host variable"?)
> > >
> > > I imagine this is copy-pasta from a test that has some global variables
> > > that are used by guest code? If that's correct, it's probably best to
> > > just drop this comment.
> > Yeah, shameful copypasta. I'll drop it.
> >
> > > > +static u64 dirty_log_manual_caps;
> > ...
> > > > +		/*
> > > > +		 * Incrementing the iteration number will start the vCPUs
> > > > +		 * dirtying memory again.
> > > > +		 */
> > > > +		iteration++;
> > > > +
> > > > +		for (i = 0; i < NR_VCPUS; i++) {
> > > > +			while (READ_ONCE(vcpu_last_completed_iteration[i])
> > > > +			       != iteration)
> > > > +				;
> > > > +		}
> > > > +
> > > > +		pr_debug("\nGetting stats after dirtying memory on pass %d:\n", iteration);
> > > > +		get_page_stats(vm, &stats_dirty_pass[iteration - 1]);
> > > Incrementing iteration, waiting for vCPUs, and grabbing stats is
> > > repeated below. Throw it in a helper function?
> >
> > Good call.
> >
> > >
> > > > +
> > > > +		memstress_get_dirty_log(vm, bitmaps, NR_SLOTS);
> > > > +
> > > > +		if (dirty_log_manual_caps) {
> > > > +			memstress_clear_dirty_log(vm, bitmaps, NR_SLOTS, pages_per_slot);
> > > > +
> > > > +			pr_debug("\nGetting stats after clearing dirty log pass %d:\n", iteration);
> > > > +			get_page_stats(vm, &stats_clear_pass[iteration - 1]);
> > > > +		}
> > > > +	}
> > > > +
> > > > +	/* Disable dirty logging */
> > > > +	memstress_disable_dirty_logging(vm, NR_SLOTS);
> > > > +
> > > > +	pr_debug("\nGetting stats after disabling dirty logging:\n");
> > > > +	get_page_stats(vm, &stats_dirty_logging_disabled);
> > > > +
> > > > +	/* Run vCPUs again to fault pages back in. */
> > > > +	iteration++;
> > > > +	for (i = 0; i < NR_VCPUS; i++) {
> > > > +		while (READ_ONCE(vcpu_last_completed_iteration[i]) != iteration)
> > > > +			;
> > > > +	}
> > > > +
> > > > +	pr_debug("\nGetting stats after repopulating memory:\n");
> > > > +	get_page_stats(vm, &stats_repopulated);
> > > > +
> > > > +	/*
> > > > +	 * Tell the vCPU threads to quit. No need to manually check that vCPUs
> > > > +	 * have stopped running after disabling dirty logging, the join will
> > > > +	 * wait for them to exit.
> > > > +	 */
> > > > +	host_quit = true;
> > > > +	memstress_join_vcpu_threads(NR_VCPUS);
> > > > +
> > > > +	memstress_free_bitmaps(bitmaps, NR_SLOTS);
> > > > +	memstress_destroy_vm(vm);
> > > > +
> > > > +	/* Make assertions about the page counts. */
> > > > +	total_4k_pages = stats_populated.pages_4k;
> > > > +	total_4k_pages += stats_populated.pages_2m * 512;
> > > > +	total_4k_pages += stats_populated.pages_1g * 512 * 512;
> > > > +
> > > > +	/*
> > > > +	 * Check that all huge pages were split. Since large pages can only
> > > > +	 * exist in the data slot, and the vCPUs should have dirtied all pages
> > > > +	 * in the data slot, there should be no huge pages left after splitting.
> > > > +	 * Splitting happens at dirty log enable time without
> > > > +	 * KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 and after the first clear pass
> > > > +	 * with that capability.
> > > > +	 */
> > > > +	if (dirty_log_manual_caps) {
> > > > +		TEST_ASSERT(stats_clear_pass[0].hugepages == 0,
> > > Consider using ASSERT_EQ() to simplify these checks. It will
> > > automatically print out the values for you, but you'll lose the
> > > contextual error message ("Unexpected huge page count after
> > > splitting..."). But maybe we could add support for a custom extra error
> > > string?
> > >
> > > __ASSERT_EQ(stats_clear_pass[0].hugepages, 0,
> > >             "Expected 0 hugepages after splitting");
> > >
> > > Or use a comment to document the context for the assertion. Whoever is
> > > debugging a failure is going to come look at the selftest code no matter
> > > what.
> > >
> > > I think I prefer ASSERT_EQ() + comment, especially since the comment
> > > pretty much already exists above.
> > That's fair. I prefer the way it is because the resulting error
> > message is a lot easier to read and I don't need to look at the test
> > code to decrypt it. If I'm developing a feature and just running all
> > tests, it's nice to not have to track down the test source code.
> >
> > > > +			    "Unexpected huge page count after splitting. Expected 0, got %ld",
> > > > +			    stats_clear_pass[0].hugepages);
> > > > +		TEST_ASSERT(stats_clear_pass[0].pages_4k == total_4k_pages,
> > > > +			    "All memory should be mapped at 4k. Expected %ld 4k pages, got %ld",
> > > > +			    total_4k_pages, stats_clear_pass[0].pages_4k);
> > > Also assert that huge pages are *not* split when dirty logging is first
> > > enabled.
> > Ah great idea. I felt like I was a little light on the assertions.
> > That'll be a good addition.
> >
> > > > +	} else {
> > ...
> > > > +
> > > > +	dirty_log_manual_caps =
> > > > +		kvm_check_cap(KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2);
> > > > +	dirty_log_manual_caps &= (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE |
> > > > +				  KVM_DIRTY_LOG_INITIALLY_SET);
> > > Since this is a correctness test I think the test should, by default,
> > > test both KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE and 0, to ensure we get
> > > test coverage of both.
> > >
> > > And with that in place, there's probably no need for the -g flag.
> > Good idea.
> >
> > > > +
> > > > +	guest_modes_append_default();
> > ...
> > > > +
> > > > +	if (!is_backing_src_hugetlb(p.backing_src)) {
> > > > +		pr_info("This test will only work reliably with HugeTLB memory. "
> > > > +			"It can work with THP, but that is best effort.");
> > > > +		return KSFT_SKIP;
> > > > +	}
> > > backing_src only controls the memstress data slots. The rest of guest
> > > memory could be a source of noise for this test.
> > That's true, but we compensate for that noise by taking a measurement
> > after the population pass. At that point the guest has executed all
> > its code (or at least from all its code pages) and touched every
> > page in the data slot. Since the other slots aren't backed with huge
> > pages, NX hugepages shouldn't be an issue. As a result, I would be
> > surprised if noise from the other memory became a problem.
> >
> > > > +
> > > > +	run_test(&p);
> > > Use for_each_guest_mode() to run against all supported guest modes.
> > I'm not sure that would actually improve coverage. None of the page
> > splitting behavior depends on the mode AFAICT.
>
> You need to use for_each_guest_mode() for the ARM case. The issue is
> that whatever mode (guest page size and VA size) you pick might not be
> supported by the host. So, you first need to explore what's available
> (via for_each_guest_mode()).

Actually, that's fixed by using the default mode, which picks the
first available mode.
I would prefer to use for_each_guest_mode() though; who knows, maybe
something fails with some specific guest page size for some reason.

>
> Thanks,
> Ricardo
>
> >
> > > >
> > > > +
> > > > +	return 0;
> > > > +}
> > > > --
> > > > 2.39.1.405.gd4c25cc71f-goog
> > > >