Re: [fedora-arm] Re: [Fedocal] Reminder meeting : Fedora ARM & AArch64 status meeting

On 7/23/19 4:19 AM, Jon Masters wrote:
On 7/19/19 1:09 PM, Jon Masters wrote:
On 7/4/19 6:12 PM, Jon Masters wrote:

I think we have identified the root cause of the 32-bit builder issue.
Many thanks to Paul and Peter for assistance in debugging. Here's my
write-up; we'll work with the vendor on a suitable mitigation to
work around any erratum:

https://medium.com/@jonmasters_84473/debugging-a-32-bit-fedora-arm-builder-issue-73295d7d673d

The hardware vendor has reproduced what I believe to be an erratum.
Meanwhile, I've made a test kernel that forces CONFIG_HIGHPTE off:

https://koji.fedoraproject.org/koji/taskinfo?taskID=36328838

With this kernel, you still get LPAE, but leaf-level PTEs are no longer
allocated from high memory. This is because I believe the erratum to be
caused by stage 1 page table walks in the guest trapping to stage 2
(hypervisor) for, e.g., Access bit updates on the host. When those
occur, I believe the guest IPA (guest physical address) is truncated to
32 bits, but only for page table walks. I think normal translation
faults are unaffected by this (TBC).
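To make the suspected failure mode concrete, here is a minimal sketch of what a 32-bit truncation would do to the address of a guest page table page. It only illustrates the hypothesis described above; the function names and the high example address are hypothetical and not taken from the test kernel.

```python
# Sketch of the suspected truncation, under the assumption stated above
# that only the low 32 bits of the faulting IPA survive during a
# stage 1 page table walk. All names here are illustrative.

MASK_32 = 0xFFFF_FFFF


def truncate_ipa(ipa: int) -> int:
    """Return the IPA as it would look if cut down to 32 bits."""
    return ipa & MASK_32


def walk_is_affected(pte_page_ipa: int) -> bool:
    """A PTE page is only misreported if truncation changes its address."""
    return truncate_ipa(pte_page_ipa) != pte_page_ipa


# A PTE page below 4 GiB (lowmem-style allocation) is reported correctly.
low_pte_page = 0x4016_3000
# A hypothetical PTE page above 4 GiB (possible with HIGHPTE on LPAE).
high_pte_page = 0x1_4016_3000

print(hex(truncate_ipa(low_pte_page)), walk_is_affected(low_pte_page))
print(hex(truncate_ipa(high_pte_page)), walk_is_affected(high_pte_page))
```

If the hypothesis holds, keeping PTE pages below 4 GiB means the truncated and real addresses coincide, which is why turning CONFIG_HIGHPTE off would sidestep the problem.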

Normally, we don't allocate PGDs (the top-level page table pieces) from
high memory (we allocate those from kernel memory caches), but we DO
allocate PTEs from what might be high memory, unless we force
CONFIG_HIGHPTE off. The patch I'm using is attached.
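For anyone wanting to check whether a given kernel build has the option enabled, the following is a small sketch that reads a kernel config in the usual `CONFIG_X=y` / `# CONFIG_X is not set` text format. The helper name and the sample config text are assumptions made for illustration; this is not the attached patch.

```python
# Sketch: report the state of one option in kernel .config style text.
# The function name and the sample text below are illustrative only.
from typing import Optional


def config_option_state(config_text: str, option: str) -> Optional[str]:
    """Return the option's value ('y', 'm', ...), 'n' if explicitly
    unset, or None if the option does not appear at all."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


sample = """\
CONFIG_ARM_LPAE=y
CONFIG_HIGHMEM=y
# CONFIG_HIGHPTE is not set
"""

print(config_option_state(sample, "CONFIG_HIGHPTE"))
```

On an installed system the same function could be pointed at the contents of the kernel's config file, wherever the distribution ships it.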

It's currently being tested. If it works, I'd like input on
temporarily carrying this in Fedora. In theory it means an LPAE system
could starve for PTEs if it has a very large number of processes
running, but in practice I'm willing to bet LPAE is mostly used by
Fedora for the 32-bit builders and that few people would actually
complain if we did this.

This stayed up for 3+ days. Eventually there were a couple of faults
that I thought were a problem, but it turns out they weren't; they
just generated noise in the host kernel log. So it looks good to go with
the hack I proposed, which is going to be in Fedora's 5.2 kernel.

Detail:

The host saw a couple of exits due to speculative page walks in the
guest. These hit my previous logic for S1 PTW faults, but this time the
HPFAR matched what we would expect given the 32-bit range limit.

[359524.820107] JCM: WARNING: Mismatched FIPA and PA translation detected!
[359524.899630] JCM: Hyper faulting far: 0x40163000
[359524.955044] JCM: Guest faulting far: 0xb6dbbf48 (gfn: 0x4016)
[359525.025047] JCM: Guest TTBCR: 0xb5023500, TTBR0: 0x4c99ca80
[359525.092963] JCM: Guest PGD address: 0x4c99ca90
[359525.147312] JCM: Guest PGD: 0x58bf7003
[359525.193319] JCM: Guest PMD address: 0x58bf7db0
[359525.247671] JCM: Guest PMD: 0x40163003
[359525.293678] JCM: Guest PTE address: 0x40163dd8
[359525.348030] JCM: Guest PTE: 0x420000367508fdf
[359525.401338] JCM: Manually translated as: 0xb6dbbf48->0x367508000
[359525.474465] JCM: Faulting IPA page: 0x40163000
[359525.528814] JCM: Faulting PTE page: 0x40163000
[359525.583166] JCM: *** debugging data ***
[359525.630215] JCM: FAR_EL2: 0xb6dbbf48
[359525.674133] JCM: HPFAR_EL2: 0x401630
[359525.718052] JCM: ESR_EL2: 0x8200008b
[359525.761972] JCM: FAR_EL1: 0x4f2e50005b89b4
[359525.812149] JCM: ESR_EL1: 0x20b
[359525.850852] JCM: *** debugging data ***
[359525.897899] JCM: Fault occurred while performing S1 PTW -fixing
[359525.969985] JCM: corrected fault_ipa: 0x40163000
[359526.026423] JCM: Corrected gfn: 0x4016
[359526.072427] JCM: handle access fault
[359526.116347] JCM: ret: 0x1

You can see the FAR reported pfn 0x4016, which is what we expected, so
the above was just noise from my test kernel on the host monitoring a
bit too carefully; it did not actually need to fix anything this time.
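The manual translation in the log above can be reproduced with a few lines of arithmetic. The sketch below assumes the ARMv7 LPAE long-descriptor format with 4 KiB guest pages and T0SZ = 0 (8-byte entries, a 4-entry first level, 512-entry second and third levels), takes the descriptor values straight from the log rather than reading guest memory, and assumes the host uses 64 KiB pages, which is consistent with the printed gfn. Function names are illustrative.

```python
# Sketch: redo the three-level LPAE walk from the log with plain arithmetic.
# Assumptions: long-descriptor format, 4 KiB guest pages, T0SZ = 0,
# and 64 KiB host pages for the gfn calculation.

ENTRY_SIZE = 8                   # bytes per long-format descriptor
ADDR_MASK = 0x000000FFFFFFF000   # output/next-table address, bits [39:12]
HOST_PAGE_SHIFT = 16             # assumption: 64 KiB host pages


def lpae_indices(va: int):
    """Split a 32-bit VA into level 1, 2 and 3 table indices."""
    return (va >> 30) & 0x3, (va >> 21) & 0x1FF, (va >> 12) & 0x1FF


def table_base(descriptor: int) -> int:
    """Extract the address field from a table or page descriptor."""
    return descriptor & ADDR_MASK


def hpfar_to_ipa(hpfar: int) -> int:
    """HPFAR_EL2 holds the faulting IPA page number starting at bit 4."""
    return (hpfar & ~0xF) << 8


# Values copied from the log.
far, ttbr0 = 0xB6DBBF48, 0x4C99CA80
pgd, pmd, pte = 0x58BF7003, 0x40163003, 0x420000367508FDF
hpfar = 0x401630

i1, i2, i3 = lpae_indices(far)
pgd_addr = ttbr0 + i1 * ENTRY_SIZE
pmd_addr = table_base(pgd) + i2 * ENTRY_SIZE
pte_addr = table_base(pmd) + i3 * ENTRY_SIZE
fault_ipa = hpfar_to_ipa(hpfar)

print(hex(pgd_addr), hex(pmd_addr), hex(pte_addr))
print(hex(table_base(pte)))               # guest physical page of the data
print(hex(fault_ipa), hex(fault_ipa >> HOST_PAGE_SHIFT))
```

The faulting IPA decoded from HPFAR_EL2 lands on the same page as the computed PTE address, and that page sits below 4 GiB, so there was nothing for the fix-up logic to correct in this instance.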

Jon.


Peter applied the workaround on rawhide

https://src.fedoraproject.org/rpms/kernel/c/d7341fee1c2697ae60db6fe23edc60ab55a59668?branch=master

I can see about bringing it into F29/F30/stabilization if Peter doesn't
get to it first.

Thanks,
Laura