+ kexec_core-accept-unaccepted-kexec-segments-destination-addresses.patch added to mm-nonmm-unstable branch

The patch titled
     Subject: kexec_core: accept unaccepted kexec segments' destination addresses
has been added to the -mm mm-nonmm-unstable branch.  Its filename is
     kexec_core-accept-unaccepted-kexec-segments-destination-addresses.patch

This patch will shortly appear at
     https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/kexec_core-accept-unaccepted-kexec-segments-destination-addresses.patch

This patch will later appear in the mm-nonmm-unstable branch at
    git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***

The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days

------------------------------------------------------
From: Yan Zhao <yan.y.zhao@xxxxxxxxx>
Subject: kexec_core: accept unaccepted kexec segments' destination addresses
Date: Fri, 13 Dec 2024 17:54:49 +0800

In TDX, TDs (hardware-isolated VMs) running a Linux guest must accept
private memory before accessing it.  Accessing private memory before
acceptance is considered a fatal error and may result in the termination
of the TD.

The "accepting memory" operation in the guest includes the following steps:
- trigger a VM-exit
- the host OS allocates a physical page and requests hardware to map the
  physical page to the GPA
- initialize memory content to 0
- encrypt the memory

For a Linux guest, eagerly accepting all memory during kernel boot can
slow down the boot process and cause unnecessary memory occupation on the
host for pages that may never be accessed.  Therefore, Linux guests
usually opt for a lazy mode to delay page acceptance operations by not
moving the pages to the buddy allocator's freelists.  Instead, the kernel
tracks memory in 4M units and places them in a zone->unaccepted_pages list
if any page in the entire 4M range is in an unaccepted state (even if part
of the memory range may have been accepted by firmware or the kernel).
When the kernel does not have enough free pages, it takes memory from
the zone->unaccepted_pages list and accepts it, so the memory is always
accepted before it is moved to the freelists and made available to the
buddy allocator.
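
As a rough illustration of the lazy mode described above, the following
is a minimal userspace sketch, not kernel code.  struct zone_model,
accept_one_unit() and alloc_pages_model() are hypothetical names
invented for this example, and the 4K page size is an assumption.  It
only models the ordering: a unit is accepted first and only then
contributes pages to the freelist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: one 4M unit holds 1024 pages of 4K (assumed sizes). */
#define PAGES_PER_UNIT 1024

struct zone_model {
	size_t free_pages;       /* accepted pages on the freelists */
	size_t unaccepted_units; /* 4M units still on the unaccepted list */
	size_t accept_calls;     /* acceptance operations performed so far */
};

/*
 * Take one unit off the unaccepted list, accept it, and only then add
 * its pages to the freelist.  Returns 1 on success, 0 if nothing is left.
 */
static int accept_one_unit(struct zone_model *z)
{
	assert(z != NULL);
	if (z->unaccepted_units == 0)
		return 0;
	z->unaccepted_units--;
	z->accept_calls++;              /* acceptance happens first ... */
	z->free_pages += PAGES_PER_UNIT; /* ... then the pages become free */
	return 1;
}

/*
 * Allocate nr pages.  Memory is accepted only when the freelist is too
 * short.  Returns 0 on success, -1 if there is not enough memory.
 */
static int alloc_pages_model(struct zone_model *z, size_t nr)
{
	while (z->free_pages < nr) {
		if (!accept_one_unit(z))
			return -1;
	}
	z->free_pages -= nr;
	return 0;
}
```

In this model an allocation that fits in the current freelist triggers
no acceptance at all, which is the property that keeps boot fast.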

The kexec segments' destination addresses are not allocated by the buddy
allocator.  Instead, they are searched for in normal system RAM (top-down
or bottom-up), excluding driver-managed, ACPI, persistent, and reserved
memory.  Although these addresses may fall within the memory range
managed by the buddy allocator (which must be in an accepted state),
they could also be outside that range and in an unaccepted state.

Since the kexec code will access the segments' destination addresses
during the kexec process by swapping their content with the segments'
source pages, it is necessary to accept the memory before performing the
swap operations.

Accept the destination addresses during the kexec load, immediately after
they pass sanity checks.  This ensures the code is located in a common
place shared by both the kexec_load and kexec_file_load system calls.

This will not conflict with the accounting in try_to_accept_memory_one()
since the accounting is set during kernel boot and decremented when pages
are moved to the freelists.  There is no harm in invoking accept_memory()
on a page before making it available to the buddy allocator.

Re-accepting memory is not a concern, since accept_memory() checks the
unaccepted bitmap before accepting a memory page.
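
To illustrate why repeated acceptance is harmless, here is a minimal
userspace sketch, not the kernel implementation.  struct accept_model,
struct segment_model, accept_memory_model() and accept_segments_model()
are hypothetical names, and the 4M tracking granularity and the
one-flag-per-unit layout are assumptions made for the example.  The
point is only that a range whose flags are already clear costs no
further acceptance work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define UNIT_SIZE (4UL << 20) /* assumed tracking granularity: 4M */
#define NR_UNITS  64          /* the model covers 256M of addresses */

struct accept_model {
	bool unaccepted[NR_UNITS]; /* true: unit still needs acceptance */
	size_t hw_accepts;         /* units actually accepted so far */
};

/* Hypothetical stand-in for a kexec segment's destination range. */
struct segment_model {
	uint64_t mem;   /* destination start address */
	uint64_t memsz; /* destination size in bytes */
};

/* Accept every still-unaccepted unit overlapping [start, start + size). */
static void accept_memory_model(struct accept_model *m,
				uint64_t start, uint64_t size)
{
	uint64_t first, last, u;

	if (size == 0)
		return;
	first = start / UNIT_SIZE;
	last = (start + size - 1) / UNIT_SIZE;
	assert(last < NR_UNITS);

	for (u = first; u <= last; u++) {
		if (!m->unaccepted[u])
			continue; /* already accepted: nothing to do */
		m->unaccepted[u] = false;
		m->hw_accepts++;
	}
}

/* Mirror of the loop in the patch: accept each segment's destination. */
static void accept_segments_model(struct accept_model *m,
				  const struct segment_model *seg, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		accept_memory_model(m, seg[i].mem, seg[i].memsz);
}
```

Calling accept_segments_model() a second time on the same segments
leaves hw_accepts unchanged in this model, which corresponds to loading
a kexec image more than once.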

Although a user may perform kexec loading without ever triggering the
jump, the cost of accepting memory that ends up unused is small, since
kexec loading is not in a performance-critical path.  Additionally, the
destination addresses are always searched and found in the same location
on a given system.

Changing the destination address search logic to locate only accepted
(or only unaccepted) memory would be unnecessary and complicated.

Link: https://lkml.kernel.org/r/20241213095449.881-1-yan.y.zhao@xxxxxxxxx
Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
Cc: Baoquan He <bhe@xxxxxxxxxx>
Cc: Jianxiong Gao <jxgao@xxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxx>
Cc: "Edgecombe, Rick P" <rick.p.edgecombe@xxxxxxxxx>
Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx>
Cc: Yan Zhao <yan.y.zhao@xxxxxxxxx>
Cc: Ashish Kalra <Ashish.Kalra@xxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
---

 kernel/kexec_core.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

--- a/kernel/kexec_core.c~kexec_core-accept-unaccepted-kexec-segments-destination-addresses
+++ a/kernel/kexec_core.c
@@ -210,6 +210,16 @@ int sanity_check_segment_list(struct kim
 	}
 #endif
 
+	/*
+	 * The destination addresses are searched from system RAM rather than
+	 * being allocated from the buddy allocator, so they are not guaranteed
+	 * to be accepted by the current kernel.  Accept the destination
+	 * addresses before kexec swaps their content with the segments' source
+	 * pages to avoid accessing memory before it is accepted.
+	 */
+	for (i = 0; i < nr_segments; i++)
+		accept_memory(image->segment[i].mem, image->segment[i].memsz);
+
 	return 0;
 }
 
_

Patches currently in -mm which might be from yan.y.zhao@xxxxxxxxx are

kexec_core-accept-unaccepted-kexec-segments-destination-addresses.patch




