> On Dec 8, 2020, at 2:00 AM, Claudio Imbrenda <imbrenda@xxxxxxxxxxxxx> wrote:
> 
> On Tue, 8 Dec 2020 01:23:59 -0800
> Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
> 
>>> On Dec 8, 2020, at 1:15 AM, Claudio Imbrenda
>>> <imbrenda@xxxxxxxxxxxxx> wrote:
>>> 
>>> On Mon, 7 Dec 2020 17:10:13 -0800
>>> Nadav Amit <nadav.amit@xxxxxxxxx> wrote:
>>> 
>>>>> On Dec 7, 2020, at 4:41 PM, Nadav Amit <nadav.amit@xxxxxxxxx>
>>>>> wrote:
>>>>>> On Oct 2, 2020, at 8:44 AM, Claudio Imbrenda
>>>>>> <imbrenda@xxxxxxxxxxxxx> wrote:
>>>>>> 
>>>>>> This is a complete rewrite of the page allocator.
>>>>> 
>>>>> This patch causes me crashes:
>>>>> 
>>>>> lib/alloc_page.c:433: assert failed: !(areas_mask & BIT(n))
>>>>> 
>>>>> It appears that two areas are registered on AREA_LOW_NUMBER, as
>>>>> setup_vm() can call (and calls on my system)
>>>>> page_alloc_init_area() twice.
>>>>> 
>>>>> setup_vm() uses AREA_ANY_NUMBER as the area number argument but
>>>>> eventually this means, according to the code, that
>>>>> __page_alloc_init_area() would use AREA_LOW_NUMBER.
>>>>> 
>>>>> I do not understand the rationale behind these areas well enough
>>>>> to fix it.
>>>> 
>>>> One more thing: I changed the previous allocator to zero any
>>>> allocated page. Without it, I get strange failures when I do not
>>>> run the tests on KVM, which are presumably caused by some
>>>> intentional or unintentional hidden assumption of kvm-unit-tests
>>>> that the memory is zeroed.
>>>> 
>>>> Can you restore this behavior? I can also send this one-line fix,
>>>> but I do not want to overstep on your (hopeful) fix for the
>>>> previous problem that I mentioned (AREA_ANY_NUMBER).
>>> 
>>> no. Some tests depend on the fact that the memory is being touched
>>> for the first time.
>>> 
>>> if your test depends on memory being zeroed on allocation, maybe you
>>> can zero the memory yourself in the test?
>>> 
>>> otherwise I can try adding a function to explicitly allocate a
>>> zeroed page.
>> 
>> To be fair, I do not know which non-zeroed memory causes the failure,
>> and debugging these kind of failures is hard and sometimes
>> non-deterministic. For instance, the failure I got this time was:
>> 
>> Test suite: vmenter
>> VM-Fail on vmlaunch: error number is 7. See Intel 30.4.
>> 
>> And other VM-entry failures, which are not easy to debug, especially
>> on bare-metal.
> 
> so you are running the test on bare metal?
> 
> that is something I had not tested

Bare-metal / VMware.

> 
>> Note that the failing test is not new, and unfortunately these kind of
>> errors (wrong assumption that memory is zeroed) are not rare, since
>> KVM indeed zeroes the memory (unlike other hypervisors and
>> bare-metal).
>> 
>> The previous allocator had the behavior of zeroing the memory to
> 
> I don't remember such behaviour, but I'll have a look

See https://www.spinics.net/lists/kvm/msg186474.html

> 
>> avoid such problems. I would argue that zeroing should be the default
>> behavior, and if someone wants to have the memory “untouched” for a
>> specific test (which one?) he should use an alternative function for
>> this matter.
> 
> probably we need some commandline switches to change the behaviour of
> the allocator according to the specific needs of each testcase
> 
> 
> I'll see what I can do

I do not think commandline switches are the right way. I think that
reproducibility requires the memory to always be in a given state
before the tests begin. There are latent bugs in kvm-unit-tests that
are not apparent when the memory is zeroed.
I do not think anyone wants to waste time on resolving these bugs.
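
To make the suggestion above concrete, here is a minimal sketch of what an
explicit zeroed-page allocation helper could look like, so that tests which
assume zeroed memory can opt in without changing the default behaviour for
tests that rely on untouched memory. The name alloc_pages_zeroed() is made up
for illustration, and the availability of alloc_pages(), PAGE_SIZE and the
listed headers is an assumption based on lib/alloc_page.c, not a claim about
the rewritten allocator's actual API:

#include <string.h>     /* memset() */
#include <alloc_page.h> /* alloc_pages(); header name assumed */
#include <asm/page.h>   /* PAGE_SIZE; header name assumed */

/*
 * Illustration only: allocate 2^order contiguous pages and zero them
 * explicitly, so the default allocation path never writes to the
 * memory and tests that depend on untouched pages keep working.
 */
static void *alloc_pages_zeroed(unsigned int order)
{
	void *p = alloc_pages(order);

	if (p)
		memset(p, 0, PAGE_SIZE << order);
	return p;
}

A test that needs zeroed memory would then call alloc_pages_zeroed(0) instead
of alloc_page(), while everything else keeps the existing calls.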