On 11/23/2017 03:11 AM, Christian König wrote:
> On 22.11.2017 at 18:27, Boris Ostrovsky wrote:
>> On 11/22/2017 11:54 AM, Christian König wrote:
>>> On 22.11.2017 at 17:24, Boris Ostrovsky wrote:
>>>> On 11/22/2017 05:09 AM, Christian König wrote:
>>>>> On 21.11.2017 at 23:26, Boris Ostrovsky wrote:
>>>>>> On 11/21/2017 08:34 AM, Christian König wrote:
>>>>>>> Hi Boris,
>>>>>>>
>>>>>>> attached are two patches.
>>>>>>>
>>>>>>> The first one is a trivial fix for the infinite loop issue, it now
>>>>>>> correctly aborts the fixup when it can't find address space for the
>>>>>>> root window.
>>>>>>>
>>>>>>> The second is a workaround for your board. It simply checks if there
>>>>>>> is exactly one Processor Function to apply this fix on.
>>>>>>>
>>>>>>> Both are based on Linus' current master branch. Please test if they
>>>>>>> fix your issue.
>>>>>> Yes, they do fix it but that's because the feature is disabled.
>>>>>>
>>>>>> Do you know what the actual problem was (on Xen)?
>>>>> I still haven't understood what you actually did with Xen.
>>>>>
>>>>> When you used PCI pass through with those devices then you have made a
>>>>> major configuration error.
>>>>>
>>>>> When the problem happened on dom0 then the explanation is most likely
>>>>> that some PCI device ended up in the configured space, but the routing
>>>>> was only set up correctly on one CPU socket.
>>>> The problem is that dom0 can be (and was in my case) booted with less
>>>> than full physical memory and so the "rest" of the host memory is not
>>>> necessarily reflected in iomem. Your patch then tried to configure that
>>>> memory for MMIO and the system hung.
>>>>
>>>> And so my guess is that this patch will break dom0 on a single-socket
>>>> system as well.
>>> Oh, thanks!
>>>
>>> I've thought about that possibility before, but wasn't able to find a
>>> system which actually does that.
>>>
>>> May I ask why the rest of the memory isn't reported to the OS?
>> That memory doesn't belong to the OS (dom0), it is owned by the
>> hypervisor.
>>
>>> Sounds like I can't trust Linux resource management and probably need
>>> to read the DRAM config to figure things out after all.
>>
>> My question is whether what you are trying to do should ever be done for
>> a guest at all (any guest, not necessarily Xen).
>
> The issue is probably that I don't know enough about Xen: What exactly
> is dom0? My understanding was that dom0 is the hypervisor, but that
> seems to be incorrect.
>
> The issue is that under no circumstances *EVER* a virtualized guest
> should have access to the PCI devices marked as "Processor Function" on
> AMD platforms. Otherwise it is trivial to break out of the virtualization.
>
> When dom0 is something like the system domain with all hardware access
> then the approach seems legitimate, but then the hypervisor should
> report the stolen memory to the OS using the e820 table.
>
> When the hypervisor doesn't do that and the Linux kernel isn't aware
> that there is memory at a given location, mapping PCI space there will
> obviously crash the hypervisor.
>
> Possible solutions as far as I can see are either disabling this feature
> when we detect that we are a Xen dom0, scanning the DRAM settings to
> update Linux resource handling or fixing Xen to report stolen memory to
> the dom0 OS as reserved.
>
> Opinions?

You are right, these functions are not exposed to a regular guest.

I think for dom0 (which is a special Xen guest, with additional
privileges) we may be able to add a reserved e820 region for host memory
that is not assigned to dom0. Let me try it on Monday (I am out until then).

-boris
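
To make the reserved-region idea concrete, here is a rough sketch of what
such a dom0 memory map could look like. It is not Xen or kernel code. The
function name, the tuple layout and the "lowest addresses first" policy
are all assumptions made for illustration; the real layout may well
differ. The point is only that host RAM not given to dom0 shows up as
reserved, so the kernel would not treat it as free space for an MMIO
window.

```python
from typing import List, Tuple

# (start, end_exclusive, kind); a hypothetical stand-in for an e820 entry
Region = Tuple[int, int, str]


def build_dom0_map(host_map: List[Region], dom0_bytes: int) -> List[Region]:
    """Hand the first dom0_bytes of host RAM to dom0, mark the rest reserved.

    Assumption: dom0 gets RAM from the lowest addresses upwards.
    Non-RAM entries are passed through unchanged.
    """
    result: List[Region] = []
    remaining = dom0_bytes
    for start, end, kind in host_map:
        if kind != "ram":
            result.append((start, end, kind))
            continue
        size = end - start
        given = min(size, remaining)
        if given > 0:
            result.append((start, start + given, "ram"))
            remaining -= given
        if given < size:
            # Host RAM that dom0 does not own: report it as reserved
            # instead of leaving a hole in the map.
            result.append((start + given, end, "reserved"))
    return result


# Made-up host: 2 GiB RAM, a 2 GiB hole below 4 GiB, then 4 GiB RAM.
host = [
    (0x0000_0000, 0x8000_0000, "ram"),
    (0x8000_0000, 0x1_0000_0000, "reserved"),
    (0x1_0000_0000, 0x2_0000_0000, "ram"),
]
dom0_map = build_dom0_map(host, 3 * 1024**3)  # dom0 booted with 3 GiB
```

With these made-up numbers, everything from 0x140000000 up to 0x200000000
would be reported as reserved rather than being absent from the map.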