On Mon, Nov 2, 2020 at 5:12 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
>
> On Mon, Nov 02, 2020 at 03:56:05PM -0800, Ben Gardon wrote:
> > On Mon, Nov 2, 2020 at 2:21 PM Peter Xu <peterx@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Oct 27, 2020 at 04:37:33PM -0700, Ben Gardon wrote:
> > > > The dirty log perf test will time various dirty logging operations
> > > > (enabling dirty logging, dirtying memory, getting the dirty log,
> > > > clearing the dirty log, and disabling dirty logging) in order to
> > > > quantify dirty logging performance. This test can be used to inform
> > > > future performance improvements to KVM's dirty logging infrastructure.
> > >
> > > One thing to mention is that there are a few patches in the kvm dirty ring
> > > series that reworked the dirty log test quite a bit (to add a similar test for
> > > dirty ring). For example:
> > >
> > > https://lore.kernel.org/kvm/20201023183358.50607-11-peterx@xxxxxxxxxx/
> > >
> > > Just an FYI if we're going to use separate test programs. Merging these tests
> > > should benefit in many ways, of course (e.g., dirty ring may be directly runnable
> > > with the perf tests too; so we could manually enable this "perf mode" as a new
> > > parameter in dirty_log_test, if possible?), however I don't know how hard -
> > > maybe there's some good reason to keep them separate...
> >
> > Absolutely, we definitely need a performance test for both modes. I'll
> > take a look at the patch you linked and see what it would take to
> > support dirty ring in this test.
>
> That would be highly appreciated.
>
> > Do you think that should be done in this series, or would it make
> > sense to add it as a follow-up?
>
> To me I slightly lean toward working upon those patches, since we should
> potentially share quite some code there (e.g., the clear dirty log cleanup
> seems necessary, or it won't be easy to add the dirty ring tests anyway). But
> the current one is still ok to me at least as an initial version - we should
> always be more tolerant of test cases, shouldn't we? :)
>
> So maybe we can wait for a 3rd opinion before you change the direction.

I took a look at your patches for the dirty ring and dirty logging modes
and thought about this some more. I think your patch to merge the get
and clear dirty log tests is great, and I can try to include it and
build on it in my series as well if desired. I don't think it would be
hard to use the same mode approach in the dirty log perf test.

That said, I think it would be easier to keep the functional tests
(dirty_log_test, clear_dirty_log_test) separate from the performance
test, because the dirty log validation adds time and complexity that the
dirty log perf test does not need. I did try building them into the same
test initially, but it was really ugly. Perhaps a future refactoring
could merge them better.

> >
> > > >
> > > [...]
> > > > +static void run_test(enum vm_guest_mode mode, unsigned long iterations,
> > > > +		     uint64_t phys_offset, int vcpus,
> > > > +		     uint64_t vcpu_memory_bytes, int wr_fract)
> > > > +{
> > >
> > > [...]
> > >
> > > > +	/* Start the iterations */
> > > > +	iteration = 0;
> > > > +	host_quit = false;
> > > > +
> > > > +	clock_gettime(CLOCK_MONOTONIC, &start);
> > > > +	for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) {
> > > > +		pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker,
> > > > +			       &perf_test_args.vcpu_args[vcpu_id]);
> > > > +	}
> > > > +
> > > > +	/* Allow the vCPU to populate memory */
> > > > +	pr_debug("Starting iteration %lu - Populating\n", iteration);
> > > > +	while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
> > > > +		pr_debug("Waiting for vcpu_last_completed_iteration == %lu\n",
> > > > +			 iteration);
> > >
> > > Isn't the array vcpu_last_completed_iteration[] initialized to all zeros? If so, I
> > > feel like this "while" won't run as expected to wait for populating memory.
> >
> > I think you are totally right. The array should be initialized to -1,
> > which I realize isn't a uint, and unsigned integer overflow is bad, so
> > the array should be converted to ints too.
> > I suppose I didn't catch this because it would just make the
> > populating pass 0 look really short and pass 1 really long. I remember
> > seeing that behavior but not realizing that it was caused by a test
> > bug. I will correct this, thank you for pointing that out.
> >
> > > The flooding pr_debug() seems a bit scary too if the mem size is huge.. How
> > > about a pr_debug() after the loop (so if we don't see it, that means it hung)?
> >
> > I don't think the number of pr_debug messages will be proportional
> > to the size of memory, but rather to the product of iterations and vCPUs.
> > That said, that's still a lot of messages.
>
> The guest code dirties all pages, and that process is proportional to the size
> of memory, no?
>
> Btw since you mentioned vcpus - I also feel like the above chunk should be put into
> the for loop above...

Ooof, I misread my code. You're totally right. I'll fix that by removing
the print there.

> > My assumption was that if you've gone to the trouble to turn on debug
> > logging, it's easier to comment log lines out than to add them, but I'm
> > also happy to just move this to a single message after the loop.
>
> Yah, that's subjective too - feel free to keep whatever you prefer. In all
> cases, hopefully I won't even need to enable pr_debug at all. :)
>
> --
> Peter Xu
>
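For concreteness, here is a minimal sketch of how the populate wait could look
after the fixes discussed above: signed per-vCPU iteration counters initialized
to -1, a wait on every vCPU rather than only the last vcpu_id, and a single
pr_debug after the wait. This is a sketch, not the final patch; it assumes the
selftest harness from the series provides vcpu_threads[], vcpu_worker(),
perf_test_args, MAX_VCPUS, READ_ONCE() and pr_debug(), and the helper name
run_populate_pass() is made up here for illustration.

#include <pthread.h>

/* Signed so that "no iteration completed yet" can be represented as -1. */
static int vcpu_last_completed_iteration[MAX_VCPUS];

static void run_populate_pass(int vcpus)
{
	int iteration = 0;
	int vcpu_id;

	for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++) {
		vcpu_last_completed_iteration[vcpu_id] = -1;
		pthread_create(&vcpu_threads[vcpu_id], NULL, vcpu_worker,
			       &perf_test_args.vcpu_args[vcpu_id]);
	}

	/* Wait for every vCPU to finish the populate pass (iteration 0). */
	for (vcpu_id = 0; vcpu_id < vcpus; vcpu_id++)
		while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) !=
		       iteration)
			;

	/* One message after the wait instead of a flood inside it. */
	pr_debug("All vCPUs finished iteration %d (populate)\n", iteration);
}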