On Wed, 23 Jun 2021 at 18:46, Sachin Sant <sachinp@xxxxxxxxxxxxxxxxxx> wrote:
>
> > Ok. This becomes even more weird. Could you share your config file and more details about
> > your setup ?
> >
> > Have you applied the patch below ?
> > https://lore.kernel.org/lkml/20210621174330.11258-1-vincent.guittot@xxxxxxxxxx/
> >
> > Regarding the load_avg warning, I can see possible problem during attach. Could you add
> > the patch below. The load_avg warning seems to happen during boot and sched_entity
> > creation.
> >
>
> Here is a summary of my testing.
>
> I have a POWER box with PowerVM hypervisor. On this box I have a logical partition (LPAR) or guest
> (allocated with 32 cpus 90G memory) running linux-next.
>
> I started with a clean slate.
> Moved to linux-next 5.13.0-rc7-next-20210622 as base code.
> Applied patch #1 from Vincent which contains changes to dequeue_load_avg()
> Applied patch #2 from Vincent which contains changes to enqueue_load_avg()
> Applied patch #3 from Vincent which contains changes to attach_entity_load_avg()
> Applied patch #4 from https://lore.kernel.org/lkml/20210621174330.11258-1-vincent.guittot@xxxxxxxxxx/
>
> With these changes applied I was still able to recreate the issue. I could see kernel warning
> during boot.
>
> I then applied patch #5 from Odin which contains changes to update_cfs_rq_load_avg()
>
> With all the 5 patches applied I was able to boot the kernel without any warning messages.
> I also ran scheduler related tests from ltp (./runltp -f sched) . All tests including cfs_bandwidth01
> ran successfully. No kernel warnings were observed.

OK, so Odin's patch fixes the problem, which highlights that we
overestimate _sum or don't sync _avg and _sum correctly.

I'm going to look at this further.

>
> Have also attached .config in case it is useful. config has CONFIG_HZ_100=y

Thanks, I will have a look.

>
> Thanks
> -Sachin
>