On 8/1/2018 10:27 AM, Eric Snowberg wrote:
On Aug 1, 2018, at 9:52 AM, Jeffrey Hugo <jhugo@xxxxxxxxxxxxxx> wrote:
On 8/1/2018 8:56 AM, Eric Snowberg wrote:
Commit d2078d5adbe2 ("x86: EFI stub support for large memory maps")
added support for EFI memory maps larger than 128 entries. Following
allocation for this new buffer (within the exit_boot function calling
alloc_e820ext) another call to efi_get_memory_map was performed.
This took place before the final call to ExitBootServices (EBS) to
insure the memory map buffer was large enough since memory allocation
is not allowed after calling EBS.
Later this code was refactored and portions were added to
efi-stub-helper.c. The call to efi_get_memory_map did not remain after
calling alloc_e820ext. In a later patch, a new EFI_MMAP_NR_SLACK_SLOTS was
introduced to help give some headroom should the memory map grow after
the EBS is called. This does not alway provide enough space with large
memory maps.
It might be nice if you CC'd the authors of the changes you are "fixing" so that they can chime in. I guess I'm lucky I noticed this on the list. You should probably also have a "fixes tag”.
Sorry I missed you.
Can you elaborate on the specific scenario where EFI_MMAP_NR_SLACK_CLOTS is insufficient? What is the exact failure encountered?
A hard hang.
Sorry, need more detail. Hang, as in the system doesn't boot? 100% of
the time? Randomly?
Sounds like, based on what you've given so far, you are hitting the
alloc path, which changes the memory map. EBS fails. We go into the
retry path, but there isn't enough buffer space, so then the priv_func
fails, resulting in a failed boot. Since EBS has already run, you
probably don't see any more prints.
Can you confirm my assumptions?
You haven't mentioned what system config generates this issue, nor what
constitutes a "large memory map" where 8 allocation slots is insufficient.
This change will call efi_get_memory_map one final time before calling EBS,
ensuring there is enough room for the memory map buffer. The additional
headroom will be left alone for any memory change that may currently
be in flight between the final efi_get_memory_map call and EBS.
Uhh, I'm thinking this should be Nack'd.
This invalidates the entire point of the priv_func because now the memory map may have changed after it was processed by priv_func, but the mem map may not have changed between your added call and EBS. If that happens, EBS will pass, and we'll go into linux with a corrupt mem map.
How would we go into Linux with a corrupt memory map?
Have you looked at the other places where priv_func is declared? Hint -
https://elixir.bootlin.com/linux/latest/source/drivers/firmware/efi/libstub/fdt.c#L217
I expect your change will break the ARM64 path, which means it'll break
my platform for my customers. Something I'd like to avoid.
Lets try to find a solution that works for your problem, but also works
for me.
The mem map can still change between your added call and EBS, so you haven't really done anything to address the existing race condition.
This patch is not trying to fix a race condition.
Your change is impacting a workaround to a race condition that is
specified in the UEFI spec. Did you look at the list discussions circa
2016 which discuss the various facets of the code you are attempting to
modify? I'm not claiming the code is perfect, but so far, it seems like
you don't completely understand the history and tradeoffs that were
accepted, therefore it seems like you may be undoing that "fix".
Everything is a race condition in the part of code you are touching.
The EFI stub isn't the only thing active at this point in time in the
system state. The FW itself (UEFI) is still active, and can be doing
things up until the point EBS is called. Therefore there are very few
assumptions you can make.
The priv_func is exit_boot_func on x86. exit_boot_func can allocate much more memory than the approx 384 bytes set aside for slack space (thru alloc_e820ext).
Please elaborate on the specific, real world scenario where this is
possible. With numbers.
Once EBS is called, there isn’t enough room available for the new memory map on a very large system (do to the memory that was previously allocated in exit_boot_func).
The race condition that exists between priv_func and the call to EBS will be taken care of by the existing code, thru the slack slots.
Now I'm slightly confused. There isn't an issue because of slack slots?
I'm kind of wondering if our assumption that 8 slots would be enough is
not invalid, and therefore we need to increase that number. I hate
kicking the can down the road like that, but as discussed in 2016, there
doesn't seem to be a good alternative.
Signed-off-by: Eric Snowberg <eric.snowberg@xxxxxxxxxx>
---
drivers/firmware/efi/libstub/efi-stub-helper.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/firmware/efi/libstub/efi-stub-helper.c b/drivers/firmware/efi/libstub/efi-stub-helper.c
index b018436..3363b4e 100644
--- a/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ b/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -845,6 +845,14 @@ efi_status_t efi_exit_boot_services(efi_system_table_t *sys_table_arg,
if (status != EFI_SUCCESS)
goto free_map;
+ /*
+ * priv_func could have allocated memory, thus call efi_get_memory_map
+ * one more time.
+ */
+ status = efi_get_memory_map(sys_table_arg, map);
+ if (status != EFI_SUCCESS)
+ goto fail;
+
status = efi_call_early(exit_boot_services, handle, *map->key_ptr);
if (status == EFI_INVALID_PARAMETER) {
--
Jeffrey Hugo
Qualcomm Datacenter Technologies as an affiliate of Qualcomm
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the
Code Aurora Forum, a Linux Foundation Collaborative Project.
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html