Re: [PATCH bpf-next v4 0/6] execmem_alloc for BPF programs

On Tue, Nov 22, 2022 at 10:06:06PM -0700, Song Liu wrote:
> On Tue, Nov 22, 2022 at 5:21 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> >
> > On Mon, Nov 21, 2022 at 07:28:36PM -0700, Song Liu wrote:
> > > On Mon, Nov 21, 2022 at 1:12 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> > > >
> [...]
> > > fixes a bug that splits the page table (from 2MB to 4kB) for the WHOLE kernel
> > > text. The bug stayed in the kernel for almost a year. None of the available
> > > open source benchmarks had caught it before this specific benchmark.
> >
> > That doesn't mean enterprise-level testing would not have caught it, and
> > enterprise kernels are based on ancient kernels, so they would not catch up
> > that fast. RHEL uses even more ancient kernels than SUSE, so let's consider
> > where SUSE was during this regression. The commit you mentioned as the fix,
> > 7af0145067bc, went upstream in v5.3-rc7~4^2, and that was in August 2019.
> > The bug was introduced through commit 585948f4f695 ("x86/mm/cpa: Avoid
> > the 4k pages check completely") and that was on v4.20-rc1~159^2~41
> > around September 2018. At that point, when the regression was committed, the
> > most bleeding-edge Enterprise Linux kernel in the industry was the one in
> > SLE15, i.e. v4.12, so there is no way in hell the performance team at SUSE,
> > for instance, would have even come close to evaluating code with that
> > regression. In fact, they wouldn't come across it in testing until SLE15-SP2
> > on the v5.3 kernel, but by then the regression would have been fixed.
> 
> Can you refer me to one enterprise performance report with an open source
> benchmark that shows a ~1% performance regression? If one is available, I am
> more than happy to try it out. Note that we need some BPF programs to show
> the benefit of this set. In most production hosts, network-related BPF programs
> are the busiest. Therefore, single-host benchmarks will not show the benefit.
> 
> Thanks,
> Song
> 
> PS: The data in [1] is full of noise:
> 
> """
> 2. For each benchmark/system combination, the 1G mapping had the highest
> performance for 45% of the tests, 2M for ~30%, and 4k for ~20%.
> 
> 3. From the average delta, among 1G/2M/4K, 4K gets the lowest
> performance in all the 4 test machines, while 1G gets the best
> performance on 2 test machines and 2M gets the best performance on the
> other 2 machines.
> """

I don't think it's noise. IMO, this means that the performance degradation
caused by fragmentation of the direct map depends heavily on the workload
and the microarchitecture.
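
To make "fragmentation of the direct map" a bit more concrete: on x86 the
kernel reports DirectMap4k, DirectMap2M and (with gbpages) DirectMap1G
counters in /proc/meminfo, and a large or growing 4k share is a rough sign
that the direct map has been split into small pages. Below is a minimal
sketch, not part of the patch set under discussion, that just prints those
counters; it assumes an x86 kernel that exposes them.

/*
 * Minimal sketch: dump the x86 DirectMap* counters from /proc/meminfo.
 * A large DirectMap4k value relative to DirectMap2M/DirectMap1G is a
 * rough indicator that the direct map has been fragmented into 4k pages.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256];

	if (!f) {
		perror("/proc/meminfo");
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, "DirectMap", strlen("DirectMap")))
			fputs(line, stdout);
	}

	fclose(f);
	return 0;
}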
 
> There is no way we can get a consistent result of a 1% performance improvement
> from experiments like those.

Experiments like those show how a change in kernel behaviour affects
different workloads rather than a single benchmark. Having a performance
improvement in a single benchmark does not necessarily mean that other
benchmarks won't regress.
 
> [1] https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@xxxxxxxxxxxxxxx/

-- 
Sincerely yours,
Mike.


