Re: [Xen-devel] Linux 4.19.5 fails to boot as Xen dom0

On Fri, Nov 30, 2018 at 02:53:50PM +0000, Hans van Kranenburg wrote:
> On 11/30/18 2:26 PM, Kirill A. Shutemov wrote:
> > On Fri, Nov 30, 2018 at 01:11:56PM +0000, Hans van Kranenburg wrote:
> >> On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
> >>> On Thu, Nov 29, 2018 at 03:00:45PM +0000, Juergen Gross wrote:
> >>>> On 29/11/2018 15:32, Kirill A. Shutemov wrote:
> >>>>> On Thu, Nov 29, 2018 at 02:24:47PM +0000, Kirill A. Shutemov wrote:
> >>>>>> On Thu, Nov 29, 2018 at 01:35:17PM +0000, Juergen Gross wrote:
> >>>>>>> On 29/11/2018 14:26, Kirill A. Shutemov wrote:
> >>>>>>>> On Thu, Nov 29, 2018 at 09:41:25AM +0000, Juergen Gross wrote:
> >>>>>>>>> On 29/11/2018 02:22, Hans van Kranenburg wrote:
> >>>>>>>>>> Hi,
> >>>>>>>>>>
> >>>>>>>>>> As also seen at:
> >>>>>>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951
> >>>>>>>>>>
> >>>>>>>>>> Attached there are two serial console output logs. One is starting with
> >>>>>>>>>> Xen 4.11 (from debian unstable) as dom0, and the other one without Xen.
> >>>>>>>>>>
> >>>>>>>>>> [    2.085543] BUG: unable to handle kernel paging request at
> >>>>>>>>>> ffff888d9fffc000
> >>>>>>>>>> [    2.085610] PGD 200c067 P4D 200c067 PUD 0
> >>>>>>>>>> [    2.085674] Oops: 0000 [#1] SMP NOPTI
> >>>>>>>>>> [    2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
> >>>>>>>>>> 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
> >>>>>>>>>> [    2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 05/21/2018
> >>>>>>>>>> [    2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
> >>>>>>>>>> [...]
> >>>>>>>>>
> >>>>>>>>> The offending stable commit is 4074ca7d8a1832921c865d250bbd08f3441b3657
> >>>>>>>>> ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), this
> >>>>>>>>> is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
> >>>>>>>>>
> >>>>>>>>> Current upstream kernel is booting fine under Xen, so in general the
> >>>>>>>>> patch should be fine. Using an upstream kernel built from above commit
> >>>>>>>>> (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is fine,
> >>>>>>>>> too.
> >>>>>>>>>
> >>>>>>>>> Kirill, are you aware of any prerequisite patch from 4.20 which could be
> >>>>>>>>> missing in 4.19.5?
> >>>>>>>>
> >>>>>>>> I'm not.
> >>>>>>>>
> >>>>>>>> Let me look into this.
> >>>>>>>>
> >>>>>>>
> >>>>>>> What is making me suspicious is the failure happening just after
> >>>>>>> releasing the init memory. Maybe there is an access to .init.data
> >>>>>>> segment or similar? The native kernel booting could be related to the
> >>>>>>> usage of 2M mappings not being available in a PV-domain.
> >>>>>>
> >>>>>> Ahh.. Could you test this:
> >>>>>>
> >>>>>> diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
> >>>>>> index a12afff146d1..7dec63ec7aab 100644
> >>>>>> --- a/arch/x86/mm/dump_pagetables.c
> >>>>>> +++ b/arch/x86/mm/dump_pagetables.c
> >>>>>> @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
> >>>>>>  	 * ffff800000000000 - ffff87ffffffffff is reserved for
> >>>>>>  	 * the hypervisor.
> >>>>>>  	 */
> >>>>>> -	return	(idx >= pgd_index(__PAGE_OFFSET) - 16) &&
> >>>>>> +	return	(idx >= pgd_index(__PAGE_OFFSET) - 17) &&
> >>>>>>  		(idx <  pgd_index(__PAGE_OFFSET));
> >>>>>>  #else
> >>>>>>  	return false;
> >>>>>
> >>>>> Or, better, this:
> >>>>
> >>>> That makes it boot again!
> >>>>
> >>>> Any idea why upstream doesn't need it?
> >>>
> >>> Nope.
> >>>
> >>> I'll prepare a proper fix.
> >>>
> >>
> >> Thanks for looking into this.
> >>
> >> In the meantime, I applied the "Or, better, this" change, and my dom0
> >> boots again.
> >>
> >> FYI, boot log now: (paste 90d valid)
> >> https://paste.debian.net/plainh/48940826
> > 
> > I forgot to CC you:
> > 
> > https://lkml.kernel.org/r/20181130121131.g3xvlvixv7mvlr7b@xxxxxxxxxxxxxxxxxx
> > 
> > Please give it a try.
> 
> I'm not in that thread, so my response here...
> 
> You paste a v2-like patch into 'Re: [PATCH 1/2]'. Juergen says:
> s/LDT_PGD_ENTRY/GUARD_HOLE_PGD_ENTRY/, then you say Ughh.., change it to
> GUARD_HOLE_ENTRY, which does not exist, and then get a Reviewed-by from
> Juergen.
> 
> I guess it has to be GUARD_HOLE_PGD_ENTRY after all...
> 
> arch/x86/include/asm/pgtable_64_types.h:116:31: error:
> 'GUARD_HOLE_ENTRY' undeclared (first use in this function); did you mean
> 'GUARD_HOLE_PGD_ENTRY'?
> 
> I'll test that instead.

Yes, thank you. It was a long week... :/

Let me know if it works. I'll repost the fixed version with your
Tested-by.

-- 
 Kirill A. Shutemov
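
Editorial sketch (not part of the original message): the first test patch quoted above widens is_hypervisor_range() from 16 to 17 PGD slots below the direct map. The C fragment below works through that index arithmetic. It assumes 4-level paging (39-bit PGD shift, 512 entries) and a direct-map base of 0xffff888000000000, which is not stated in the thread. The names pgd_index_l4, pgd_slot_start, and in_skipped_range are hypothetical helpers for illustration, not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions: 4-level paging, so each PGD slot spans 2^39 bytes
 * and a PGD holds 512 slots. */
#define PGDIR_SHIFT    39
#define PTRS_PER_PGD   512

/* Assumed direct-map base after the LDT remap move; not given in the thread. */
#define PAGE_OFFSET_L4 0xffff888000000000ULL

/* Hypothetical helper: PGD slot number that a virtual address falls into. */
static unsigned int pgd_index_l4(uint64_t addr)
{
    return (unsigned int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/* Hypothetical helper: first virtual address covered by a PGD slot.
 * Slots in the upper half are sign-extended to canonical form. */
static uint64_t pgd_slot_start(unsigned int idx)
{
    uint64_t addr = (uint64_t)idx << PGDIR_SHIFT;

    if (idx >= PTRS_PER_PGD / 2)
        addr |= 0xffff000000000000ULL;
    return addr;
}

/* Hypothetical helper mirroring the shape of the patched condition:
 * true if idx lies in the slots_below slots just under the direct map. */
static bool in_skipped_range(unsigned int idx, unsigned int slots_below)
{
    unsigned int top = pgd_index_l4(PAGE_OFFSET_L4);

    return idx >= top - slots_below && idx < top;
}
```

Under these assumptions the direct map starts at slot 273. Subtracting 16 skips slots 257 to 272 and leaves slot 256, which starts at ffff800000000000, to be walked. Subtracting 17 also covers slot 256.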


