Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96

On Mon, 2024-09-23 at 19:45 -0700, Ricardo Neri wrote:
> On Thu, Sep 19, 2024 at 01:19:27PM +0200,
> gregkh@xxxxxxxxxxxxxxxxxxx wrote:
> > On Wed, Sep 18, 2024 at 06:54:33AM +0000, Zhang, Rui wrote:
> > > On Mon, 2024-08-12 at 14:11 +0200, Greg KH wrote:
> > > > On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis
> > > > wrote:
> > > > > [CCing the x86 folks, Greg, and the regressions list]
> > > > > 
> > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > 
> > > > > On 30.07.24 18:41, Thomas Lindroth wrote:
> > > > > > I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines and
> > > > > > noticed that the dmesg line "Incomplete global flushes, disabling
> > > > > > PCID" had disappeared from the log.
> > > > > 
> > > > > Thomas, thx for the report. FWIW, mainline developers like the x86
> > > > > folks or Tony are free to focus on mainline and leave stable/longterm
> > > > > series to other people -- some nevertheless help out regularly or
> > > > > occasionally. So with a bit of luck this mail will make one of them
> > > > > care enough to provide a 6.1 version of what you afaics called the
> > > > > "existing fix" in mainline (2eda374e883ad2 ("x86/mm: Switch to new
> > > > > Intel CPU model defines") [v6.10-rc1]) that seems to be missing in
> > > > > 6.1.y. But if not I suspect it might be up to you to prepare and
> > > > > submit a 6.1.y variant of that fix, as you seem to care and are able
> > > > > to test the patch.
> > > > 
> > > > Needs to go to 6.6.y first, right?  But even then, it does not apply
> > > > to 6.1.y cleanly, so someone needs to send a backported (and tested)
> > > > series to us at stable@xxxxxxxxxxxxxxx and we will be glad to queue
> > > > them up then.
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > 
> > > There are three commits involved.
> > > 
> > > commit A:
> > >    4db64279bc2b ("x86/cpu: Switch to new Intel CPU model defines")
> > >    This commit replaces
> > >       X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */
> > >    with
> > >       X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
> > >    This is a functional change because the family info is replaced
> > > with 0. It exposes an x86_match_cpu() problem: the lookup breaks
> > > when the vendor/family/model/stepping/feature fields are all zeros.
> > > 
> > > commit B:
> > >    93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
> > > X86_VENDOR_INTEL")
> > >    It addresses the x86_match_cpu() problem by introducing a valid
> > > flag and setting the flag in the Intel CPU model defines.
> > >    This fixes commit A, but it actually breaks the x86_cpu_id
> > > structures that are constructed without using the Intel CPU model
> > > defines, like the one in arch/x86/mm/init.c.
> > > 
> > > commit C:
> > >    2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> > >    arch/x86/mm/init.c: broken by commit B but fixed by using the new
> > > Intel CPU model defines
> > > 
> > > In 6.1.99,
> > > commit A is missing
> > > commit B is there
> > > commit C is missing
> > > 
> > > In 6.6.50,
> > > commit A is missing
> > > commit B is there
> > > commit C is missing
> > > 
> > > Now we can fix the problem in the stable kernels by converting
> > > arch/x86/mm/init.c to use the CPU model defines (even the old style
> > > ones). But before that, I'm wondering whether we need commit B
> > > backported to the 6.1 and 6.6 stable kernels at all, because only
> > > commit A can expose this problem.
> > 
> > If so, can you submit the needed backports for us to apply?  That's the
> > easiest way for us to take them, thanks.
> 
> I audited all the uses of x86_match_cpu(match). All callers that construct
> the `match` argument using the family of X86_MATCH_* macros from
> arch/x86/include/asm/cpu_device_id.h function correctly because commit B,
> 93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
> X86_VENDOR_INTEL"), has been backported to v6.1.99 and to v6.6.50.
> 
> Only those callers that compose the `match` argument by hand are buggy:
>     * arch/x86/mm/init.c
>     * drivers/powercap/intel_rapl_msr.c (only in 6.1.99)

Thanks for auditing this. I overlooked the intel_rapl driver case.
> 
> Summarizing, v6.1.99 needs these two commits from mainline
>     * d05b5e0baf42 ("powercap: RAPL: fix invalid initialization for
>       pl4_supported field")
>     * 2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> 
> v6.6.50 only needs the second commit.

Well, commit B 93022482b294 ("x86/cpu: Fix x86_match_cpu() to match
just X86_VENDOR_INTEL") has been backported to all stable kernels, and
the two broken cases above are present there as well.

So I suppose we need to backport both fixes to the 5.x stable kernels
too.
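
To make the interaction between commits A, B and C concrete, here is a
minimal user-space sketch of the two table-termination rules. It is not
the kernel code: struct cpu_id, match_old(), match_new() and
FLAG_ENTRY_VALID are hypothetical, simplified stand-ins for
struct x86_cpu_id, x86_match_cpu() and the valid flag from commit B, and
the stepping/feature fields are left out. The model number 0x97 is just
an example value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct x86_cpu_id; not the kernel's real layout. */
struct cpu_id {
	uint16_t vendor;
	uint16_t family;
	uint16_t model;
	uint16_t flags;
};

#define VENDOR_INTEL     0      /* X86_VENDOR_INTEL is 0 in the kernel */
#define VENDOR_ANY       0xffff
#define FAMILY_ANY       0
#define MODEL_ANY        0
#define FLAG_ENTRY_VALID 0x1    /* hypothetical name for commit B's valid flag */

/* Field comparison shared by both loop variants. */
static int entry_matches(const struct cpu_id *m, const struct cpu_id *cpu)
{
	if (m->vendor != VENDOR_ANY && m->vendor != cpu->vendor)
		return 0;
	if (m->family != FAMILY_ANY && m->family != cpu->family)
		return 0;
	if (m->model != MODEL_ANY && m->model != cpu->model)
		return 0;
	return 1;
}

/* Before commit B: an entry whose match fields are all zero ends the table. */
const struct cpu_id *match_old(const struct cpu_id *table,
			       const struct cpu_id *cpu)
{
	for (const struct cpu_id *m = table;
	     m->vendor | m->family | m->model; m++)
		if (entry_matches(m, cpu))
			return m;
	return NULL;
}

/* After commit B: the first entry without the valid flag ends the table. */
const struct cpu_id *match_new(const struct cpu_id *table,
			       const struct cpu_id *cpu)
{
	for (const struct cpu_id *m = table;
	     m->flags & FLAG_ENTRY_VALID; m++)
		if (entry_matches(m, cpu))
			return m;
	return NULL;
}

/* "Any Intel CPU" in the style of commit A: every match field is zero. */
static const struct cpu_id any_intel[] = {
	{ .vendor = VENDOR_INTEL, .family = FAMILY_ANY, .model = MODEL_ANY,
	  .flags = FLAG_ENTRY_VALID },
	{ 0 }
};

/* Hand-built entry in the style of arch/x86/mm/init.c: flags stay zero. */
static const struct cpu_id hand_built[] = {
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x97 },
	{ 0 }
};

/* Same entry with the valid flag set, as the shared match macros do. */
static const struct cpu_id flagged[] = {
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x97,
	  .flags = FLAG_ENTRY_VALID },
	{ 0 }
};

static const struct cpu_id sample_cpu = {
	.vendor = VENDOR_INTEL, .family = 6, .model = 0x97
};

void run_checks(void)
{
	/* Problem exposed by commit A: the all-zero entry looks like the end. */
	assert(match_old(any_intel, &sample_cpu) == NULL);
	/* Commit B fixes that case. */
	assert(match_new(any_intel, &sample_cpu) == &any_intel[0]);
	/* The hand-built table worked with the old rule. */
	assert(match_old(hand_built, &sample_cpu) == &hand_built[0]);
	/* With only commit B it never matches: the regression in this thread. */
	assert(match_new(hand_built, &sample_cpu) == NULL);
	/* Setting the valid flag, which is what commit C achieves, restores it. */
	assert(match_new(flagged, &sample_cpu) == &flagged[0]);
}
```

In this model, the fourth check is the arch/x86/mm/init.c situation in
the stable trees: the table is unchanged, but the new termination rule
stops at its first entry, so nothing ever matches.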

thanks,
rui
> 
> I will submit these backports.
> 
> Thanks and BR,
> Ricardo




