On 1 August 2018 at 19:07, Jeffrey Hugo <jhugo@xxxxxxxxxxxxxx> wrote: > On 8/1/2018 10:27 AM, Eric Snowberg wrote: >> >> >>> On Aug 1, 2018, at 9:52 AM, Jeffrey Hugo <jhugo@xxxxxxxxxxxxxx> wrote: >>> >>> On 8/1/2018 8:56 AM, Eric Snowberg wrote: >>>> >>>> Commit d2078d5adbe2 ("x86: EFI stub support for large memory maps") >>>> added support for EFI memory maps larger than 128 entries. Following >>>> allocation for this new buffer (within the exit_boot function calling >>>> alloc_e820ext) another call to efi_get_memory_map was performed. >>>> This took place before the final call to ExitBootServices (EBS) to >>>> insure the memory map buffer was large enough since memory allocation >>>> is not allowed after calling EBS. >>>> Later this code was refactored and portions were added to >>>> efi-stub-helper.c. The call to efi_get_memory_map did not remain after >>>> calling alloc_e820ext. In a later patch, a new EFI_MMAP_NR_SLACK_SLOTS >>>> was >>>> introduced to help give some headroom should the memory map grow after >>>> the EBS is called. This does not alway provide enough space with large >>>> memory maps. >>> >>> It might be nice if you CC'd the authors of the changes you are "fixing" >>> so that they can chime in. I guess I'm lucky I noticed this on the list. >>> You should probably also have a "fixes tag”. >> >> >> Sorry I missed you. >> >>> >>> Can you elaborate on the specific scenario where EFI_MMAP_NR_SLACK_CLOTS >>> is insufficient? What is the exact failure encountered? >> >> >> A hard hang. > > > Sorry, need more detail. Hang, as in the system doesn't boot? 100% of the > time? Randomly? > > Sounds like, based on what you've given so far, you are hitting the alloc > path, which changes the memory map. EBS fails. We go into the retry path, > but there isn't enough buffer space, so then the priv_func fails, resulting > in a failed boot. Since EBS has already run, you probably don't see any > more prints. > > Can you confirm my assumptions? > > You haven't mentioned what system config generates this issue, nor what > constitutes a "large memory map" where 8 allocation slots is insufficient. > >> >>> >>>> This change will call efi_get_memory_map one final time before calling >>>> EBS, >>>> ensuring there is enough room for the memory map buffer. The additional >>>> headroom will be left alone for any memory change that may currently >>>> be in flight between the final efi_get_memory_map call and EBS. >>> >>> >>> Uhh, I'm thinking this should be Nack'd. >>> >>> This invalidates the entire point of the priv_func because now the memory >>> map may have changed after it was processed by priv_func, but the mem map >>> may not have changed between your added call and EBS. If that happens, EBS >>> will pass, and we'll go into linux with a corrupt mem map. >> >> >> How would we go into Linux with a corrupt memory map? > > > Have you looked at the other places where priv_func is declared? Hint - > https://elixir.bootlin.com/linux/latest/source/drivers/firmware/efi/libstub/fdt.c#L217 > > I expect your change will break the ARM64 path, which means it'll break my > platform for my customers. Something I'd like to avoid. > > Lets try to find a solution that works for your problem, but also works for > me. > >> >>> The mem map can still change between your added call and EBS, so you >>> haven't really done anything to address the existing race condition. >> >> >> This patch is not trying to fix a race condition. > > > Your change is impacting a workaround to a race condition that is specified > in the UEFI spec. Did you look at the list discussions circa 2016 which > discuss the various facets of the code you are attempting to modify? I'm > not claiming the code is perfect, but so far, it seems like you don't > completely understand the history and tradeoffs that were accepted, > therefore it seems like you may be undoing that "fix". > > Everything is a race condition in the part of code you are touching. The EFI > stub isn't the only thing active at this point in time in the system state. > The FW itself (UEFI) is still active, and can be doing things up until the > point EBS is called. Therefore there are very few assumptions you can make. > To clarify: between the last call to GetMemoryMap() and the first invocation of ExitBootServices(), event handlers may fire that allocate or free memory, changing the state of the memory map and thus the map key, resulting in EBS() to return failure. However, the first thing EBS() does is disable event dispatching (even if it fails further on), so this failure can be mitigated by calling EBS() a second time. This was the purpose of Jeffrey's contribution in 2016. >> The priv_func is exit_boot_func on x86. exit_boot_func can allocate much >> more memory than the approx 384 bytes set aside for slack space (thru >> alloc_e820ext). > > > Please elaborate on the specific, real world scenario where this is > possible. With numbers. > I know that there are x86 systems with an insanely fragmented memory map. Can you share the output of memmap (called from the UEFI shell) on such as system? >> Once EBS is called, there isn’t enough room available for the new memory >> map on a very large system (do to the memory that was previously allocated >> in exit_boot_func). >> >> The race condition that exists between priv_func and the call to EBS will >> be taken care of by the existing code, thru the slack slots. > > > Now I'm slightly confused. There isn't an issue because of slack slots? > > I'm kind of wondering if our assumption that 8 slots would be enough is not > invalid, and therefore we need to increase that number. I hate kicking the > can down the road like that, but as discussed in 2016, there doesn't seem to > be a good alternative. > But why does alloc_e820ext have to be deferred to the last minute? Surely, we can find a reasonable upper bound and do the allocation [if necessary] before calling efi_exit_boot_services() ? -- To unsubscribe from this list: send the line "unsubscribe linux-efi" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html