On Tue, Nov 22, 2022 at 10:06:06PM -0700, Song Liu wrote:
> On Tue, Nov 22, 2022 at 5:21 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> >
> > On Mon, Nov 21, 2022 at 07:28:36PM -0700, Song Liu wrote:
> > > On Mon, Nov 21, 2022 at 1:12 PM Luis Chamberlain <mcgrof@xxxxxxxxxx> wrote:
> > > >
>
> [...]
>
> > > fixes a bug that splits the page table (from 2MB to 4kB) for the
> > > WHOLE kernel text. The bug stayed in the kernel for almost a year.
> > > None of the available open source benchmarks had caught it before
> > > this specific benchmark.

> > That doesn't mean enterprise level testing would not have caught it,
> > and enterprise distributions run on ancient kernels so they would not
> > catch up that fast. RHEL uses even more ancient kernels than SUSE, so
> > let's consider where SUSE was during this regression. The commit you
> > mentioned, the fix 7af0145067bc, went upstream in v5.3-rc7~4^2, and
> > that was in August 2019. The bug was introduced through commit
> > 585948f4f695 ("x86/mm/cpa: Avoid the 4k pages check completely") and
> > that was in v4.20-rc1~159^2~41, around September 2018. Around
> > September 2018, when the regression was committed, the most bleeding
> > edge Enterprise Linux kernel in the industry was the one on SLE15,
> > i.e. v4.12, so there is no way in hell the performance team at SUSE,
> > for instance, would have even come close to evaluating code with that
> > regression. In fact, they wouldn't have come across it in testing
> > until SLE15-SP2 on the v5.3 kernel, but by then the regression would
> > have been fixed.

> Can you refer me to one enterprise performance report with an open
> source benchmark that shows a ~1% performance regression? If it is
> available, I am more than happy to try it out. Note that we need some
> BPF programs to show the benefit of this set. In most production hosts,
> network-related BPF programs are the busiest. Therefore, single-host
> benchmarks will not show the benefit.
>
> Thanks,
> Song
>
> PS: Data in [1] is full of noise:
>
> """
> 2. For each benchmark/system combination, the 1G mapping had the
> highest performance for 45% of the tests, 2M for ~30%, and 4k for ~20%.
>
> 3. From the average delta, among 1G/2M/4K, 4K gets the lowest
> performance in all the 4 test machines, while 1G gets the best
> performance on 2 test machines and 2M gets the best performance on the
> other 2 machines.
> """

I don't think it's noise. IMO, this means that the performance
degradation caused by the fragmentation of the direct map depends
heavily on the workload and the microarchitecture (a minimal way to
look at the direct map split on a given host is sketched below).

> There is no way we can get consistent results of 1% performance
> improvement from experiments like those.

Experiments like those show how a change in the kernel behaviour
affects different workloads, not a single benchmark. Having a
performance improvement in a single benchmark does not necessarily
mean other benchmarks won't regress.

> [1] https://lore.kernel.org/linux-mm/213b4567-46ce-f116-9cdf-bbd0c884eb3c@xxxxxxxxxxxxxxx/

--
Sincerely yours,
Mike.
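
Side note on measuring this: on x86 the direct map split shows up in the
DirectMap4k/DirectMap2M/DirectMap1G counters in /proc/meminfo, so
comparing snapshots taken before and after a workload gives a rough
picture of how much the direct map got fragmented (the kernel text
mapping is a separate story and needs the ptdump files under debugfs,
where enabled). Below is a minimal, illustrative sketch that only prints
those counters; the little program itself is not part of any patch here.

/*
 * Illustrative only, not part of this series: print the x86 direct map
 * breakdown from /proc/meminfo.  DirectMap4k/DirectMap2M/DirectMap1G
 * are the counters the kernel exposes on x86_64; the program around
 * them is just a sketch.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256];

	if (!f) {
		perror("/proc/meminfo");
		return 1;
	}

	while (fgets(line, sizeof(line), f)) {
		/* Keep only the DirectMap* lines, e.g. "DirectMap2M:  ... kB" */
		if (!strncmp(line, "DirectMap", strlen("DirectMap")))
			fputs(line, stdout);
	}

	fclose(f);
	return 0;
}

Running it before and after a benchmark and watching DirectMap4k grow at
the expense of DirectMap2M/DirectMap1G is a crude but workload-independent
way to see how much splitting a run caused; it says nothing about where
the hot data lives, which is part of why the per-workload numbers in [1]
differ so much.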