On Tue, Jun 30, 2020 at 5:31 PM Al Stone <ahs3@xxxxxxxxxx> wrote:
>
> On 30 Jun 2020 13:44, Rafael J. Wysocki wrote:
> > On Mon, Jun 29, 2020 at 10:57 PM Al Stone <ahs3@xxxxxxxxxx> wrote:
> > >
> > > On 29 Jun 2020 18:33, Rafael J. Wysocki wrote:
> > > > From: "Rafael J. Wysocki" <rafael.j.wysocki@xxxxxxxxx>
> > > >
> > > > The ACPICA's strategy with respect to the handling of memory mappings
> > > > associated with memory operation regions is to avoid mapping the
> > > > entire region at once which may be problematic at least in principle
> > > > (for example, it may lead to conflicts with overlapping mappings
> > > > having different attributes created by drivers). It may also be
> > > > wasteful, because memory opregions on some systems take up vast
> > > > chunks of address space while the fields in those regions actually
> > > > accessed by AML are sparsely distributed.
> > > >
> > > > For this reason, a one-page "window" is mapped for a given opregion
> > > > on the first memory access through it and if that "window" does not
> > > > cover an address range accessed through that opregion subsequently,
> > > > it is unmapped and a new "window" is mapped to replace it. Next,
> > > > if the new "window" is not sufficient to acess memory through the
> > > > opregion in question in the future, it will be replaced with yet
> > > > another "window" and so on. That may lead to a suboptimal sequence
> > > > of memory mapping and unmapping operations, for example if two fields
> > > > in one opregion separated from each other by a sufficiently wide
> > > > chunk of unused address space are accessed in an alternating pattern.
> > > >
> > > > The situation may still be suboptimal if the deferred unmapping
> > > > introduced previously is supported by the OS layer. For instance,
> > > > the alternating memory access pattern mentioned above may produce
> > > > a relatively long list of mappings to release with substantial
> > > > duplication among the entries in it, which could be avoided if
> > > > acpi_ex_system_memory_space_handler() did not release the mapping
> > > > used by it previously as soon as the current access was not covered
> > > > by it.
> > > >
> > > > In order to improve that, modify acpi_ex_system_memory_space_handler()
> > > > to preserve all of the memory mappings created by it until the memory
> > > > regions associated with them go away.
> > > >
> > > > Accordingly, update acpi_ev_system_memory_region_setup() to unmap all
> > > > memory associated with memory opregions that go away.
> > > >
> > > > Reported-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@xxxxxxxxx>
> > > > ---
> > > >  drivers/acpi/acpica/evrgnini.c | 14 ++++----
> > > >  drivers/acpi/acpica/exregion.c | 65 ++++++++++++++++++++++++----------
> > > >  include/acpi/actypes.h         | 12 +++++--
> > > >  3 files changed, 64 insertions(+), 27 deletions(-)
> > > >
> > > > diff --git a/drivers/acpi/acpica/evrgnini.c b/drivers/acpi/acpica/evrgnini.c
> > > > index aefc0145e583..89be3ccdad53 100644
> > > > --- a/drivers/acpi/acpica/evrgnini.c
> > > > +++ b/drivers/acpi/acpica/evrgnini.c
> > > > @@ -38,6 +38,7 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
> > > >  	union acpi_operand_object *region_desc =
> > > >  	    (union acpi_operand_object *)handle;
> > > >  	struct acpi_mem_space_context *local_region_context;
> > > > +	struct acpi_mem_mapping *mm;
> > > >
> > > >  	ACPI_FUNCTION_TRACE(ev_system_memory_region_setup);
> > > >
> > > > @@ -46,13 +47,14 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
> > > >  			local_region_context =
> > > >  			    (struct acpi_mem_space_context *)*region_context;
> > > >
> > > > -			/* Delete a cached mapping if present */
> > > > +			/* Delete memory mappings if present */
> > > >
> > > > -			if (local_region_context->mapped_length) {
> > > > -				acpi_os_unmap_memory(local_region_context->
> > > > -						     mapped_logical_address,
> > > > -						     local_region_context->
> > > > -						     mapped_length);
> > > > +			while (local_region_context->first_mm) {
> > > > +				mm = local_region_context->first_mm;
> > > > +				local_region_context->first_mm = mm->next_mm;
> > > > +				acpi_os_unmap_memory(mm->logical_address,
> > > > +						     mm->length);
> > > > +				ACPI_FREE(mm);
> > > >  			}
> > > >  			ACPI_FREE(local_region_context);
> > > >  			*region_context = NULL;
> > > > diff --git a/drivers/acpi/acpica/exregion.c b/drivers/acpi/acpica/exregion.c
> > > > index d15a66de26c0..fd68f2134804 100644
> > > > --- a/drivers/acpi/acpica/exregion.c
> > > > +++ b/drivers/acpi/acpica/exregion.c
> > > > @@ -41,6 +41,7 @@ acpi_ex_system_memory_space_handler(u32 function,
> > > >  	acpi_status status = AE_OK;
> > > >  	void *logical_addr_ptr = NULL;
> > > >  	struct acpi_mem_space_context *mem_info = region_context;
> > > > +	struct acpi_mem_mapping *mm = mem_info->cur_mm;
> > > >  	u32 length;
> > > >  	acpi_size map_length;
> > >
> > > I think this needs to be:
> > >
> > >       acpi_size map_length = mem_info->length;
> > >
> > > since it now gets used in the ACPI_ERROR() call below.
> >
> > No, it's better to print the length value in the message.
>
> Yeah, that was the other option.
>
> > > I'm getting a "maybe used unitialized" error on compilation.
> >
> > Thanks for reporting!
> >
> > I've updated the commit in the acpica-osl branch with the fix.
>
> Thanks, Rafael.
>
> Do you have a generic way of testing this?  I can see a way to do it
> -- timing a call of a method in a dynamically loaded SSDT -- but if
> you had a test case laying around, I could continue to be lazy :).

I don't check the timing, but instrument the code to see if what
happens is what is expected.

Now, the overhead reduction resulting from this change in Linux is
quite straightforward.  With the original code, every time the current
mapping doesn't cover the request at hand, an unmap is carried out,
which involves a linear search through acpi_ioremaps.  That is generally
(at least a bit) more expensive than the linear search through the list
of opregion-specific mappings introduced by the $subject patch, because
the acpi_ioremaps list quite likely holds more items.

And, of course, with the $subject patch applied, if the opregion in
question holds many fields that are not all covered by one mapping,
each of them needs to be mapped just once per opregion life cycle.
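
To make the argument above concrete, here is a minimal user-space
sketch of the per-opregion mapping list idea.  It is not the code from
the patch: struct mem_mapping, struct region_ctx, region_get_mapping(),
region_teardown() and WINDOW_SIZE are hypothetical names made up for
illustration, malloc() stands in for the real map operation, and the
window placement is simplified (the window simply starts at the
accessed address).  The counters play the role of the instrumentation
mentioned earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define WINDOW_SIZE 4096u	/* hypothetical one-page window */

/* Hypothetical stand-in for the per-mapping object in the patch. */
struct mem_mapping {
	uint64_t phys;			/* start of the mapped window */
	size_t length;			/* window length in bytes */
	struct mem_mapping *next;
};

/* Hypothetical stand-in for the per-opregion context. */
struct region_ctx {
	struct mem_mapping *first;	/* all windows created so far */
	struct mem_mapping *cur;	/* window used by the last access */
	unsigned int map_calls;		/* instrumentation: maps performed */
	unsigned int unmap_calls;	/* instrumentation: unmaps performed */
};

static int covers(const struct mem_mapping *mm, uint64_t addr, size_t len)
{
	return mm && addr >= mm->phys && addr + len <= mm->phys + mm->length;
}

/*
 * Return a window covering [addr, addr + len), creating one only if no
 * existing window of this region covers the range.  Nothing is unmapped
 * here; windows live until region_teardown().  Returns NULL if the
 * allocation fails.
 */
static struct mem_mapping *region_get_mapping(struct region_ctx *ctx,
					      uint64_t addr, size_t len)
{
	struct mem_mapping *mm;

	assert(len > 0 && len <= WINDOW_SIZE);

	/* Fast path: the window used last time still covers the access. */
	if (covers(ctx->cur, addr, len))
		return ctx->cur;

	/* Linear search, but only through this region's own windows. */
	for (mm = ctx->first; mm; mm = mm->next) {
		if (covers(mm, addr, len)) {
			ctx->cur = mm;
			return mm;
		}
	}

	/* Not covered yet: "map" a new window and keep it on the list. */
	mm = malloc(sizeof(*mm));
	if (!mm)
		return NULL;

	mm->phys = addr;
	mm->length = WINDOW_SIZE;
	mm->next = ctx->first;
	ctx->first = mm;
	ctx->cur = mm;
	ctx->map_calls++;
	return mm;
}

/* Release every window when the region goes away. */
static void region_teardown(struct region_ctx *ctx)
{
	while (ctx->first) {
		struct mem_mapping *mm = ctx->first;

		ctx->first = mm->next;
		free(mm);
		ctx->unmap_calls++;
	}
	ctx->cur = NULL;
}
```

With this structure, accessing two far-apart fields in an alternating
pattern costs one map per field in total, no matter how many times the
accesses alternate, and the unmaps all happen at teardown time.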