On Thu, Sep 26, 2019 at 2:48 PM Boaz Harrosh <openosd@xxxxxxxxx> wrote:
>
> On 26/09/2019 10:11, Miklos Szeredi wrote:
> > I found a big scheduler scalability bottleneck that is caused by
> > update of mm->cpu_bitmap at context switch.  This can be worked
> > around by using shared memory instead of shared page tables, which is
> > a bit of a pain, but it does prove the point.  Thought about fixing
> > the cpu_bitmap cacheline pingpong, but didn't really get anywhere.
> >
>
> I'm not sure what is the scalability bottleneck you are seeing above.
> With zufs I have a very good scalability, almost flat up to the
> number of CPUs, and/or the limit of the memory bandwidth if I'm accessing
> pmem.

This was *really* noticeable with NUMA and many cpus (>64).

> Miklos would you please have some bandwidth to review my code? It would
> make me very happy and calm. Your input is very valuable to me.

Sure, will look at the patches.

Thanks,
Miklos