On 11/29/18 4:06 PM, Kirill A. Shutemov wrote:
> On Thu, Nov 29, 2018 at 03:00:45PM +0000, Juergen Gross wrote:
>> On 29/11/2018 15:32, Kirill A. Shutemov wrote:
>>> On Thu, Nov 29, 2018 at 02:24:47PM +0000, Kirill A. Shutemov wrote:
>>>> On Thu, Nov 29, 2018 at 01:35:17PM +0000, Juergen Gross wrote:
>>>>> On 29/11/2018 14:26, Kirill A. Shutemov wrote:
>>>>>> On Thu, Nov 29, 2018 at 09:41:25AM +0000, Juergen Gross wrote:
>>>>>>> On 29/11/2018 02:22, Hans van Kranenburg wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> As also seen at:
>>>>>>>> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914951
>>>>>>>>
>>>>>>>> Attached there are two serial console output logs. One is starting with
>>>>>>>> Xen 4.11 (from debian unstable) as dom0, and the other one without Xen.
>>>>>>>>
>>>>>>>> [ 2.085543] BUG: unable to handle kernel paging request at
>>>>>>>> ffff888d9fffc000
>>>>>>>> [ 2.085610] PGD 200c067 P4D 200c067 PUD 0
>>>>>>>> [ 2.085674] Oops: 0000 [#1] SMP NOPTI
>>>>>>>> [ 2.085736] CPU: 1 PID: 1 Comm: swapper/0 Not tainted
>>>>>>>> 4.19.0-trunk-amd64 #1 Debian 4.19.5-1~exp1+pvh1
>>>>>>>> [ 2.085823] Hardware name: HP ProLiant DL360 G7, BIOS P68 05/21/2018
>>>>>>>> [ 2.085895] RIP: e030:ptdump_walk_pgd_level_core+0x1fd/0x490
>>>>>>>> [...]
>>>>>>>
>>>>>>> The offending stable commit is 4074ca7d8a1832921c865d250bbd08f3441b3657
>>>>>>> ("x86/mm: Move LDT remap out of KASLR region on 5-level paging"), this
>>>>>>> is commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream.
>>>>>>>
>>>>>>> Current upstream kernel is booting fine under Xen, so in general the
>>>>>>> patch should be fine. Using an upstream kernel built from above commit
>>>>>>> (with the then needed Xen fixup patch 1457d8cf7664f34c4ba534) is fine,
>>>>>>> too.
>>>>>>>
>>>>>>> Kirill, are you aware of any prerequisite patch from 4.20 which could be
>>>>>>> missing in 4.19.5?
>>>>>>
>>>>>> I'm not.
>>>>>>
>>>>>> Let me look into this.
>>>>>>
>>>>>
>>>>> What is making me suspicious is the failure happening just after
>>>>> releasing the init memory. Maybe there is an access to .init.data
>>>>> segment or similar? The native kernel booting could be related to the
>>>>> usage of 2M mappings not being available in a PV-domain.
>>>>
>>>> Ahh.. Could you test this:
>>>>
>>>> diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
>>>> index a12afff146d1..7dec63ec7aab 100644
>>>> --- a/arch/x86/mm/dump_pagetables.c
>>>> +++ b/arch/x86/mm/dump_pagetables.c
>>>> @@ -496,7 +496,7 @@ static inline bool is_hypervisor_range(int idx)
>>>>  	 * ffff800000000000 - ffff87ffffffffff is reserved for
>>>>  	 * the hypervisor.
>>>>  	 */
>>>> -	return (idx >= pgd_index(__PAGE_OFFSET) - 16) &&
>>>> +	return (idx >= pgd_index(__PAGE_OFFSET) - 17) &&
>>>>  		(idx < pgd_index(__PAGE_OFFSET));
>>>>  #else
>>>>  	return false;
>>>
>>> Or, better, this:
>>
>> That makes it boot again!
>>
>> Any idea why upstream doesn't need it?
>
> Nope.
>
> I'll prepare a proper fix.
>

Thanks for looking into this.

In the meantime, I applied the "Or, better, this" change, and my dom0
boots again.

FYI, boot log now: (paste 90d valid)
https://paste.debian.net/plainh/48940826

Hans
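
For anyone following the off-by-one in the first quoted diff, the index
arithmetic can be sketched as below. This is a standalone userspace
illustration, not kernel code. It assumes 4-level paging (PGDIR_SHIFT 39,
512 PGD entries) and that the LDT remap commit moved the direct-map base
from ffff880000000000 to ffff888000000000; that is consistent with the
faulting address in the oops, but it is not stated in this thread. The
page_offset and slots parameters and check_examples() are hypothetical,
added only for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-level paging, each PGD entry covers 2^39 bytes. */
#define PGDIR_SHIFT   39
#define PTRS_PER_PGD  512

/* Assumed direct-map bases before and after the LDT remap change. */
#define PAGE_OFFSET_OLD 0xffff880000000000ULL
#define PAGE_OFFSET_NEW 0xffff888000000000ULL

/* Index of the PGD slot that maps addr. */
static inline int pgd_index(uint64_t addr)
{
    return (int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/*
 * Same shape as the check in the quoted diff, but with the base and the
 * number of reserved slots passed in (hypothetical parameters).
 */
static inline bool is_hypervisor_range(int idx, uint64_t page_offset, int slots)
{
    return (idx >= pgd_index(page_offset) - slots) &&
           (idx < pgd_index(page_offset));
}

/* Hypothetical self-check: first slot of ffff800000000000 - ffff87ffffffffff. */
void check_examples(void)
{
    int first_hv = pgd_index(0xffff800000000000ULL);

    assert(first_hv == 256);
    assert(pgd_index(PAGE_OFFSET_OLD) == 272);
    assert(pgd_index(PAGE_OFFSET_NEW) == 273);

    /* Old base: "- 16" covers slot 256. */
    assert(is_hypervisor_range(first_hv, PAGE_OFFSET_OLD, 16));
    /* New base: "- 16" starts at slot 257 and misses slot 256. */
    assert(!is_hypervisor_range(first_hv, PAGE_OFFSET_NEW, 16));
    /* New base: "- 17" covers slot 256 again. */
    assert(is_hypervisor_range(first_hv, PAGE_OFFSET_NEW, 17));
}
```

Under those assumptions, moving the base up by one PGD slot shifts the
"- 16" window off the first hypervisor slot, which would match the page
table dumper walking into the hypervisor range under Xen PV.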