On Fri, Apr 10, 2020 at 05:26:34PM +0200, Ard Biesheuvel wrote:
> On Fri, 10 Apr 2020 at 16:48, Arvind Sankar <nivedita@xxxxxxxxxxxx> wrote:
> > On Thu, Apr 09, 2020 at 12:35:30PM -0400, Arvind Sankar wrote:
> > > On Thu, Apr 09, 2020 at 04:47:55PM +0200, Ard Biesheuvel wrote:
> > > > On Thu, 9 Apr 2020 at 16:39, Arvind Sankar <nivedita@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > On Wed, Apr 08, 2020 at 09:49:15AM +0200, Ard Biesheuvel wrote:
> > > > > > (add Peter, Leif and Daniel)
> > > > > >
> > > > > > On Wed, 8 Apr 2020 at 09:43, Dave Young <dyoung@xxxxxxxxxx> wrote:
> > > > > > >
> > > > > > > On 04/06/20 at 02:06pm, Arvind Sankar wrote:
> > > > > > > > Commit
> > > > > > > >
> > > > > > > > 3ee372ccce4d ("x86/boot/compressed/64: Remove .bss/.pgtable from
> > > > > > > > bzImage")
> > > > > > > >
> > > > > > > > removed the .bss section from the bzImage.
> > > > > > > >
> > > > > > > > However, while a PE loader is required to zero-initialize the .bss
> > > > > > > > section before calling the PE entry point, the EFI handover protocol
> > > > > > > > does not currently document any requirement that .bss be initialized by
> > > > > > > > the bootloader prior to calling the handover entry.
> > > > > > > >
> > > > > > > > When systemd-boot is used to boot a unified kernel image [1], the image
> > > > > > > > is constructed by embedding the bzImage as a .linux section in a PE
> > > > > > > > executable that contains a small stub loader from systemd together with
> > > > > > > > additional sections and potentially an initrd. As the .bss section
> > > > > > > > within the bzImage is no longer explicitly present as part of the file,
> > > > > > > > it is not initialized before calling the EFI handover entry.
> > > > > > > > Furthermore, as the size of the embedded .linux section is only the size
> > > > > > > > of the bzImage file itself, the .bss section's memory may not even have
> > > > > > > > been allocated.
> > > > > > > >
> > > > > > > I did not follow up the old report, maybe I missed something. But not
> > > > > > > sure why only systemd-boot is mentioned here. I also have similar issue
> > > > > > > with early efi failure. With these two patches applied, it works well
> > > > > > > then.
> > > > > > >
> > > > > > > BTW, I use Fedora 31 + Grub2
> > > > > > >
> > > > > >
> > > > > > OK, so I take it this means that GRUB's PE/COFF loader does not
> > > > > > zero-initialize BSS either? Does it honor the image size in memory if
> > > > > > it exceeds the file size?
> > > > >
> > > > > Dave, that comment was because the previous report was for systemd-boot
> > > > > stub.
> > > > >
> > > > > Ard, should I revise the commit message to make it clear it's not
> > > > > restricted to systemd-boot but anything using handover entry may be
> > > > > affected? Maybe just a "for example, when systemd-boot..." and then a
> > > > > line to say grub2 with the EFI stub patches is also impacted?
> > > > >
> > > >
> > > > Well, the fact the /some/ piece of software is used in production that
> > > > relies on the ill-defined EFI handover protocol is sufficient
> > > > justification, so I don't think it is hugely important to update it.
> > > >
> > > > > https://src.fedoraproject.org/rpms/grub2/blob/f31/f/0001-Add-support-for-Linux-EFI-stub-loading.patch#_743
> > > > >
> > > > > + kernel_mem = grub_efi_allocate_pages_max(lh.pref_address,
> > > > > + BYTES_TO_PAGES(lh.init_size));
> > > > >
> > > > > Looking at this, grub does allocate init_size for the image, but it
> > > > > doesn't zero it out.
> > > > >
> > > > > This call also looks wrong to me though. It allocates at max address of
> > > > > pref_address, which, if it succeeds, will guarantee that the kernel gets
> > > > > loaded entirely below pref_address == LOAD_PHYSICAL_ADDR. In native
> > > > > mode, if it weren't for the EFI stub copying the kernel again, this
> > > > > would cause the startup code to relocate the kernel into unallocated
> > > > > memory. On a mixed-mode boot, this would cause the early page tables
> > > > > setup prior to transitioning to 64-bit mode to be in unallocated memory
> > > > > and potentially get clobbered by the EFI stub.
> > > > >
> > > > > The first try to allocate pref_address should be calling
> > > > > grub_efi_allocate_fixed instead.
> > > >
> > > > Thanks Arvind. I'm sure the Fedora/RedHat folks on cc should be able
> > > > to get these logged somewhere.
> > >
> > > Ok. For dracut, the process for building the unified kernel image needs
> > > a check to make sure the kernel can fit in the space provided for it --
> > > there is 16MiB of space and the distro bzImage's are up to 10-11MiB in
> > > size, so there's some slack left at present.
> > >
> > > Additionally, in mixed-mode, the unified kernel images are quite likely
> > > to end up with early pgtables from startup_32 clobbering the initrd,
> > > independently of the recent kernel changes. Hopefully no-one actually
> > > uses these in mixed-mode.
> >
> > The grub EFI handover entry patch is busted in mixed-mode for another
> > reason -- while it allocates init_size, it doesn't use the correct
> > alignment. I tested on a Debian buster VM in mixed-mode (that was the
> > one I was able to get to install/boot with mixed-mode), and the early
> > pagetable from startup_32 ends up in unallocated memory due to the
> > rounding up of the bzImage address to account for kernel alignment. This
> > would be an existing problem prior to these patches.
> >
> > Should we try to handle this in the kernel? At some point KASLR is going
> > to pick that memory for the kernel and overwrite the pagetables I would
> > think, resulting in sporadic crashes that are almost unreproducible.
>
> Upstream GRUB does not implement the EFI handover protocol at all, and
> the distros all have their own GRUB forks that implement this along
> with mixed mode, secure boot, shim, measured boot etc.

Exactly...

> What you are saying is that GRUB forks turn out to exist that violate
> both the PE/COFF specification and the Linux/x86 boot protocol in a
> way that might break mixed mode, and nobody noticed until you happened
> to find it by code inspection. While I appreciate the effort, I think
> this is where I would like to draw the line, and say that there is
> only so much we can do to work around bugs in out-of-tree forks of
> other projects. So unless it can be done cleanly and without losing
> any of the benefits of the recent cleanup and optimization work, I'd
> say don't bother.

I fully agree!

Daniel
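
For anyone reading the thread later, the loader-side behaviour being asked
for above (reserve init_size rather than just the file size, respect the
kernel alignment, and zero whatever the file does not cover) can be sketched
roughly as below. This is a user-space illustration only, not GRUB, systemd
or kernel code: struct image_info, align_up() and load_image() are
hypothetical names, malloc() stands in for the firmware page allocator, and
the fields merely mirror the init_size and alignment values discussed in the
thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the boot-protocol values named in the thread. */
struct image_info {
    size_t file_size;    /* bytes of the bzImage actually present in the file */
    size_t init_size;    /* memory the kernel needs in place, including .bss */
    size_t kernel_align; /* required load alignment, assumed a power of two */
};

/* Round addr up to the next multiple of align (a power of two). */
uintptr_t align_up(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~((uintptr_t)align - 1);
}

/*
 * Copy the image into a buffer that covers init_size bytes at the required
 * alignment and zero the part the file does not provide.
 * Returns the aligned load address, or NULL on failure. *raw receives the
 * pointer that must later be passed to free().
 */
unsigned char *load_image(const unsigned char *file,
                          const struct image_info *info, void **raw)
{
    unsigned char *base;

    *raw = NULL;
    if (info->file_size > info->init_size)
        return NULL;

    /* Over-allocate so that rounding up cannot run past the buffer. */
    *raw = malloc(info->init_size + info->kernel_align);
    if (*raw == NULL)
        return NULL;

    base = (unsigned char *)align_up((uintptr_t)*raw, info->kernel_align);

    memcpy(base, file, info->file_size);
    /* The tail stands in for .bss: nothing in the file backs it. */
    memset(base + info->file_size, 0, info->init_size - info->file_size);
    return base;
}
```

The point of over-allocating by kernel_align is that the rounded-up address
plus init_size still lies inside the reservation, which is the property the
mixed-mode case above loses when only init_size is allocated.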