Hi Bjorn, On 12/8/22 19:57, Bjorn Helgaas wrote: > On Wed, Dec 07, 2022 at 04:31:12PM +0100, Hans de Goede wrote: >> On 12/4/22 10:13, Hans de Goede wrote: >> >> <snip> >> >>>>> 2. I am afraid that now allowing PCI MMIO space to be allocated >>>>> in regions marked as EfiMemoryMappedIO will cause regressions >>>>> on some systems. Specifically when I tried something similar >>>>> the last time I looked at this (using the BIOS date cut-off >>>>> approach IIRC) there was a suspend/resume regression on >>>>> a Lenovo ThinkPad X1 carbon (20A7) model: >>>>> >>>>> https://bugzilla.redhat.com/show_bug.cgi?id=2029207 >>>>> >>>>> Back then I came to the conclusion that the problem is that not >>>>> avoiding the EfiMemoryMappedIO regions caused PCI MMIO space to >>>>> be allocated in the 0xdfa00000 - 0xdfa10000 range which is >>>>> listed in the EFI memmap as: >>>>> >>>>> [ 0.000000] efi: mem46: [MMIO |RUN| | | | | | | | | | | | | ] range=[0x00000000dfa00000-0x00000000dfa0ffff] (0MB) >>>>> >>>>> And with current kernels with the extra logging added for this >>>>> the following is logged related to this: >>>>> >>>>> [ 0.326504] acpi PNP0A08:00: clipped [mem 0xdfa00000-0xfebfffff window] to [mem 0xdfa10000-0xfebfffff window] for e820 entry [mem 0xdceff000-0xdfa0ffff] >>>>> >>>>> I believe patch 1/4 of this set will make this clipping go away, >>>>> re-introducing the suspend/resume problem. >>>> >>>> Yes, I'm afraid you're right. Comparing the logs at comment #31 >>>> (fails) and comment #38 (works): >>>> >>>> pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window] >>>> pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfa00000-0xdfbfffff] fails >>>> pci 0000:00:1c.0: BAR 14: assigned [mem 0xdfb00000-0xdfcfffff] works >>>> >>>> Since 0xdfa00000 is included in the host bridge _CRS, but isn't >>>> usable, my guess is this is a _CRS bug. >>> >>> Ack. >>> >>> So I was thinking to maybe limit the removal of EfiMemoryMappedIO >>> regions from the E820 map if they are big enough to cause troubles? >>> >>> Looking at the EFI map MMIO regions on this Lenovo ThinkPad X1 carbon >>> (20A7) model, they are tiny. Where as the ones which we know cause >>> problems are huge. So maybe add a bit of heuristics to patch 1/4 based >>> on the EfiMemoryMappedIO region size and only remove the big ones >>> from the E820 map ? >>> >>> I know that adding heuristics like this always feels a bit wrong, >>> because you end up putting a somewhat arbitrary cut off point in >>> the code on which to toggle behavior on/off, but I think that in >>> this case it should work nicely given how huge the EfiMemoryMappedIO >>> regions which are actually causing problems are. > > I'll post a v2 that removes only regions 256KB or larger in a minute. Ok, may I ask why 256KB? I see that that rules out then troublesome MMIO regions from the X1 carbon from: https://bugzilla.redhat.com/show_bug.cgi?id=2029207 : efi: mem46: [MMIO|RUN| ] range=[0xdfa00000-0xdfa0ffff] (0MB) [64K] which we know we need to avoid / keep reserved. But OTOH the reservations which are causing the problems with assigning resources to PCI devices by Linux look like this: efi: mem50: [MMIO |RUN| | | | | | | | | | | | |UC] range=[0x0000000065400000-0x00000000cfffffff] (1708MB) which is significantly larger then 256KB. So we could e.g. also put the cut-off point at 16MB and still remove the above troublesome reservation from the E820 table. Note just thinking out loud here. I have no idea if 16MB would be better... > >> Looking at the efi=debug output from: >> >> https://bugzilla-attachments.redhat.com/attachment.cgi?id=1861035 >> >> The small MMIO regions which we most honor as reserved do >> have the "RUN" (runtime) flag set in the EFI mmap. > > Just trying to follow along here, so not sure any of the following is > relevant ... > > This attachment is from > https://bugzilla.redhat.com/show_bug.cgi?id=2029207, and it shows: > > efi: mem46: [MMIO|RUN| ] range=[0xdfa00000-0xdfa0ffff] (0MB) [64K] > efi: mem47: [MMIO|RUN|UC] range=[0xf80f8000-0xf80f8fff] (0MB) [4K] > pci_bus 0000:00: root bus resource [mem 0xdfa00000-0xfebfffff window] > pci_bus 0000:00: root bus resource [mem 0xfed40000-0xfed4bfff window] > > mem46 is included in the PNP0A08 _CRS, and Ivan has verified > experimentally that we have to avoid it. Ack. > mem47 is also included in the _CRS, but I don't have a clue what it > is. Maybe some hidden device used by BIOS but not visible to us? Could be, there is at least one hidden device called the P2SB on most Intel systems. >> But I'm afraid that the same applies to the troublesome >> MMIO EFI regions which cause the failures to assign >> PCI regions for devices not setup by the firmware: >> >> https://bugzilla-attachments.redhat.com/attachment.cgi?id=1861407 >> >> So that "RUN" flag is of no use. > > I don't know what bug this attachment is from. It is from https://bugzilla.redhat.com/show_bug.cgi?id=1868899 which is the ideapad slim 3 with the touchpad issue caused by the: efi: mem50: [MMIO |RUN| | | | | | | | | | | | |UC] range=[0x0000000065400000-0x00000000cfffffff] (1708MB) reservation getting in the way of assigning resources to the i2c-controller. > Is the point here that you considered doing the E820 removal based on > the EFI_MEMORY_RUNTIME memory *attribute* instead of the > EFI_MEMORY_MAPPED_IO memory *type*? > > I don't really know the details of EFI_MEMORY_MAPPED_IO vs > EFI_MEMORY_RUNTIME, but it looks like EFI_MEMORY_RUNTIME can be > applied to things like EFI_RUNTIME_SERVICES_CODE (not MMIO space) that > should stay in E820. Sorry for the confusion. What I was trying to say is that I was interested in seeing if we could use the "RUN" flag to differentiate between: 1. The big MMIO region which we want to remove from the e820 map: efi: mem50: [MMIO |RUN| | | | | | | | | | | | |UC] range=[0x0000000065400000-0x00000000cfffffff] (1708MB) 2. The small MMIO region which we want to keep to avoid the reported suspend/resume issue: efi: mem46: [MMIO|RUN| ] range=[0xdfa00000-0xdfa0ffff] (0MB) [64K] But unfortunately both have the RUN flag set so the RUN flag is of no use to us. Regards, Hans