On 17/01/2017 03:18, Li, Liang Z wrote:
>> On 29/12/2016 10:25, Liang Li wrote:
>>> x86-64 is currently limited physical address width to 46 bits, which
>>> can support 64 TiB of memory. Some vendors require to support more for
>>> some use case. Intel plans to extend the physical address width to
>>> 52 bits in some of the future products.
>>>
>>> The current EPT implementation only supports 4 level page table, which
>>> can support maximum 48 bits physical address width, so it's needed to
>>> extend the EPT to 5 level to support 52 bits physical address width.
>>>
>>> This patchset has been tested in the SIMICS environment for 5 level
>>> paging guest, which was patched with Kirill's patchset for enabling
>>> 5 level page table, with both the EPT and shadow page support. I just
>>> covered the booting process, the guest can boot successfully.
>>>
>>> Some parts of this patchset can be improved. Any comments on the
>>> design or the patches would be appreciated.
>>
>> I will review the patches. They seem fairly straightforward.
>>
>> However, I am worried about the design of the 5-level page table feature
>> with respect to migration.
>>
>> Processors that support the new LA57 mode can write
>> 57-canonical/48-noncanonical linear addresses to some registers even
>> when LA57 mode is inactive. This is true even of unprivileged
>> instructions, in particular WRFSBASE/WRGSBASE.
>>
>> This is fairly bad because, if a guest performs such a write (because of a bug
>> or because of malice), it will not be possible to migrate the virtual machine to
>> a machine that lacks LA57 mode.
>>
>> Ordinarily, hypervisors trap CPUID to hide features that are only present in
>> some processors of a heterogeneous cluster, and the hypervisor also traps
>> for example CR4 writes to prevent enabling features that were masked away.
>> In this case, however, the only way for the hypervisor to prevent the write
>> would be to run the guest with
>> CR4.FSGSBASE=0 and trap all executions of WRFSBASE/WRGSBASE. This
>> might have negative effects on performance for workloads that use the
>> instructions.
>>
>> Of course, this is a problem even without your patches. However, I think it
>> should be addressed first. I am seriously thinking of blacklisting FSGSBASE
>> completely on LA57 machines until the above is fixed in hardware.
>>
>> Paolo
>
> The issue has already been forwarded to the hardware guys, still waiting for the feedback.

Going to review this now. Any news?

Paolo
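
As an illustration of the 57-canonical/48-noncanonical case described in the quoted mail, here is a minimal sketch in C of the canonicality check. The helper names is_canonical() and needs_la57() are hypothetical and are not taken from the patchset or from KVM; the sketch only assumes the usual x86-64 rule that bits 63 down to (width - 1) of a canonical address are all equal.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: an address is canonical for a linear-address
 * width of `width` bits (48 or 57 here) if bits 63..(width-1) are
 * either all zero or all one.
 */
static bool is_canonical(uint64_t addr, unsigned width)
{
    uint64_t upper    = addr >> (width - 1);       /* bits 63..width-1 */
    uint64_t all_ones = UINT64_MAX >> (width - 1); /* same bits, all set */

    return upper == 0 || upper == all_ones;
}

/*
 * Hypothetical helper: true for values that are valid with 57-bit
 * linear addresses but not with 48-bit ones, i.e. the values the
 * migration concern is about.
 */
static bool needs_la57(uint64_t addr)
{
    return is_canonical(addr, 57) && !is_canonical(addr, 48);
}
```

A value for which needs_la57() returns true is the kind of base that, per the concern above, a guest could load with WRFSBASE/WRGSBASE on an LA57-capable host and that a destination limited to 48-bit linear addresses would not accept.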