On Fri, Mar 07, 2025 at 07:10:46AM +0200, Jarkko Sakkinen wrote:
> On Thu, Dec 19, 2024 at 11:41:58AM -0800, Ross Philipson wrote:
> > From: "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>
> >
> > Introduce background, overview and configuration/ABI information
> > for the Secure Launch kernel feature.
> >
> > Signed-off-by: Daniel P. Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx>
> > Signed-off-by: Ross Philipson <ross.philipson@xxxxxxxxxx>
> > Reviewed-by: Bagas Sanjaya <bagasdotme@xxxxxxxxx>
> > ---
> >  Documentation/security/index.rst                   |   1 +
> >  .../security/launch-integrity/index.rst            |  11 +
> >  .../security/launch-integrity/principles.rst       | 317 ++++++++++
> >  .../secure_launch_details.rst                      | 587 ++++++++++++++++++
> >  .../secure_launch_overview.rst                     | 252 ++++++++
> >  5 files changed, 1168 insertions(+)
> >  create mode 100644 Documentation/security/launch-integrity/index.rst
> >  create mode 100644 Documentation/security/launch-integrity/principles.rst
> >  create mode 100644 Documentation/security/launch-integrity/secure_launch_details.rst
> >  create mode 100644 Documentation/security/launch-integrity/secure_launch_overview.rst
> >
> > diff --git a/Documentation/security/index.rst b/Documentation/security/index.rst
> > index 3e0a7114a862..f89741271ed0 100644
> > --- a/Documentation/security/index.rst
> > +++ b/Documentation/security/index.rst
> > @@ -20,3 +20,4 @@ Security Documentation
> >     landlock
> >     secrets/index
> >     ipe
> > +   launch-integrity/index
> > diff --git a/Documentation/security/launch-integrity/index.rst b/Documentation/security/launch-integrity/index.rst
> > new file mode 100644
> > index 000000000000..838328186dd2
> > --- /dev/null
> > +++ b/Documentation/security/launch-integrity/index.rst
> > @@ -0,0 +1,11 @@
> > +=====================================
> > +System Launch Integrity documentation
> > +=====================================
> > +
> > +.. toctree::
> > +   :maxdepth: 1
> > +
> > +   principles
> > +   secure_launch_overview
> > +   secure_launch_details
> > +
> > diff --git a/Documentation/security/launch-integrity/principles.rst b/Documentation/security/launch-integrity/principles.rst
> > new file mode 100644
> > index 000000000000..a0553d1d93c2
> > --- /dev/null
> > +++ b/Documentation/security/launch-integrity/principles.rst
> > @@ -0,0 +1,317 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +.. Copyright (c) 2019-2024 Daniel P. Smith <dpsmith@xxxxxxxxxxxxxxxxxxxx>
> > +
> > +=======================
> > +System Launch Integrity
> > +=======================
> > +
> > +:Author: Daniel P. Smith
> > +:Date: August 2024
> > +
> > +This document serves to establish a common understanding of what a system
> > +launch is, the integrity concern for system launch, and why using a Root of Trust
> > +(RoT) from a Dynamic Launch may be desirable. Throughout this document,
> > +terminology from the Trusted Computing Group (TCG) and National Institute for
> > +Standards and Technology (NIST) is used to ensure that vendor natural language is
> > +used to describe and reference security-related concepts.
> > +
> > +System Launch
> > +=============
> > +
> > +There is a tendency to only consider the classical power-on boot as the only
> > +means to launch an Operating System (OS) on a computer system. In fact, most
> > +modern processors support two system launch methods. To provide clarity,
> > +it is important to establish a common definition of a system launch: during
> > +a single power life cycle of a system, a system launch consists of an initialization
> > +event, typically in hardware, that is followed by an executing software payload
> > +that takes the system from the initialized state to a running state. Driven by
> > +the Trusted Computing Group (TCG) architecture, modern processors are able to
> > +support two methods of system launch. These two methods of system launch are known
> > +as Static Launch and Dynamic Launch.
> > +
> > +Static Launch
> > +-------------
> > +
> > +Static launch is the system launch associated with the power cycle of the CPU.
> > +Thus, static launch refers to the classical power-on boot where the
> > +initialization event is the release of the CPU from reset and the system
> > +firmware is the software payload that brings the system up to a running state.
> > +Since static launch is the system launch associated with the beginning of the
> > +power lifecycle of a system, it is therefore a fixed, one-time system launch.
> > +It is because of this that static launch is referred to and thought of as being
> > +"static".
> > +
> > +Dynamic Launch
> > +--------------
> > +
> > +Modern CPUs architectures provides a mechanism to re-initialize the system to a
> > +"known good" state without requiring a power event. This re-initialization
> > +event is the event for a dynamic launch and is referred to as the Dynamic
> > +Launch Event (DLE). The DLE functions by accepting a software payload, referred
> > +to as the Dynamic Configuration Environment (DCE), that execution is handed to
> > +after the DLE is invoked. The DCE is responsible for bringing the system back
> > +to a running state. Since the dynamic launch is not tied to a power event like
> > +the static launch, this enables a dynamic launch to be initiated at any time
> > +and multiple times during a single power life cycle. This dynamism is the
> > +reasoning behind referring to this system launch as "dynamic".
> > +
> > +Because a dynamic launch can be conducted at any time during a single power
> > +life cycle, they are classified into one of two types: an early launch or a
> > +late launch.
> > +
> > +:Early Launch: When a dynamic launch is used as a transition from a static
> > +    launch chain to the final Operating System.
> > +
> > +:Late Launch: The usage of a dynamic launch by an executing Operating System to
> > +    transition to a "known good" state to perform one or more operations, e.g. to
> > +    launch into a new Operating System.
> > +
> > +System Integrity
> > +================
> > +
> > +A computer system can be considered a collection of mechanisms that work
> > +together to produce a result. The assurance that the mechanisms are functioning
> > +correctly and producing the expected result is the integrity of the system. To
> > +ensure a system's integrity, there is a subset of these mechanisms, commonly
> > +referred to as security mechanisms, that is present to help ensure the system
> > +produces the expected result or at least detects the potential of an unexpected
> > +result. Since the security mechanisms are relied upon to ensue the integrity of
> > +the system, these mechanisms are trusted. Upon inspection, these security
> > +mechanisms each have a set of properties and these properties can be evaluated
> > +to determine how susceptible a mechanism might be to failure. This assessment is
> > +referred to as the Strength of Mechanism, which allows the trustworthiness of
> > +that mechanism to be quantified.
> > +
> > +For software systems, there are two system states for which the integrity is
> > +critical: when the software is loaded into memory and when the software is
> > +executing on the hardware. Ensuring that the expected software is loaded into
> > +memory is referred to as load-time integrity while ensuring that the software
> > +executing is the expected software is the runtime integrity of that software.
>
> I'd consider deleting the first paragraph. It really does not provide
> anything useful. The 2nd paragraph is a totally sufficient introduction
> to the topic, and makes far more sense.
>
> We don't need a phrase in kernel documentation stating that a computer
> is a system that produces a result :-)
>
> This should at least be an easy enough change to make. I don't think it
> even needs a refined version, as the text below provides more than
> enough (and in many places useful) detail on the topic.
>
> > +
> > +Load-time Integrity
> > +-------------------
> > +
> > +It is critical to understand what load-time integrity establishes about a
> > +system and what is assumed, i.e. what is being trusted. Load-time integrity is
>
> I'd delete the very first sentence completely. It serves zero purpose.
> This would be a much less exhausting read if I could start right away
> with what load-time integrity is.
>
> Reassurance serves zero purpose. It is up to the reader of kernel
> documentation to make such an evaluation.
>
> > +when a trusted entity, i.e. an entity with an assumed integrity, takes an
> > +action to assess an entity being loaded into memory before it is used. A
> > +variety of mechanisms may be used to conduct the assessment, each with
> > +different properties. A particular property is whether the mechanism creates an
> > +evidence of the assessment. Often either cryptographic signature checking or
> > +hashing are the common assessment operations used.
> > +
> > +A signature checking assessment functions by requiring a representation of the
> > +accepted authorities and uses those representations to assess if the entity has
> > +been signed by an accepted authority. The benefit to this process is that
> > +assessment process includes an adjudication of the assessment. The drawbacks
> > +are that 1) the adjudication is susceptible to tampering by the Trusted
> > +Computing Base (TCB), 2) there is no evidence to assert that an untampered
> > +adjudication was completed, and 3) the system must be an active participant in
> > +the key management infrastructure.
> > +
> > +A cryptographic hashing assessment does not adjudicate the assessment, but
>
> This is actually a language barrier: is "cryptographic hashing
> assessment" the same as "cryptographic measurement"? I'd consider using
> the latter as it has wider reach. Most people who know any cryptography
> know what measurement means.
>
> > +instead generates evidence of the assessment to be adjudicated independently.
> > +The benefits to this approach is that the assessment may be simple such that it
> > +may be implemented in an immutable mechanism, e.g. in hardware. Additionally,
> > +it is possible for the adjudication to be conducted where it cannot be tampered
> > +with by the TCB. The drawback is that a compromised environment will be allowed
> > +to execute until an adjudication can be completed.
> > +
> > +Ultimately, load-time integrity provides confidence that the correct entity was
> > +loaded and in the absence of a run-time integrity mechanism assumes, i.e.
> > +trusts, that the entity will never become corrupted.
> > +
> > +Runtime Integrity
> > +-----------------
> > +
> > +Runtime integrity in the general sense is when a trusted entity makes an
> > +assessment of an entity at any point in time during the assessed entity's
> > +execution. A more concrete explanation is the taking of an integrity assessment
>
> Great, this is better than the last subsection as it gets straight into
> the topic! No reassurance part ;-)
>
> > +of an active process executing on the system at any point during the process'
> > +execution. Often the load-time integrity of an operating system's user-space,
> > +i.e. the operating environment, is confused with the runtime integrity of the
> > +system, since it is an integrity assessment of the "runtime" software. The
> > +reality is that actual runtime integrity is a very difficult problem and thus
> > +not very many solutions are public and/or available. One example of a runtime
> > +integrity solution would be Johns Hopkins Advanced Physics Laboratory's (APL)
> > +Linux Kernel Integrity Module (LKIM).
> > +
> > +Trust Chains
> > +============
> > +
> > +Building upon the understanding of security mechanisms to establish load-time
> > +integrity of an entity, it is possible to chain together load-time integrity
> > +assessments to establish the integrity of the whole system. This process is
> > +known as transitive trust and provides the concept of building a chain of
> > +load-time integrity assessments, commonly referred to as a trust chain. These
> > +assessments may be used to adjudicate the load-time integrity of the whole
> > +system. This trust chain is started by a trusted entity that does the first
> > +assessment. This first entity is referred to as the Root of Trust(RoT) with the
> > +entities name being derived from the mechanism used for the assessment, i.e.
> > +RoT for Verification (RTV) and RoT for Measurement (RTM).
> > +
> > +A trust chain is itself a mechanism, specifically a mechanism of mechanisms,
> > +and therefore it also has a Strength of Mechanism. The factors that contribute
> > +to the strength of a trust chain are:
> > +
> > +  - The strength of the chain's RoT
> > +  - The strength of each member of the trust chain
> > +  - The length, i.e. the number of members, of the chain
> > +
> > +Therefore, the strongest trust chains should start with a strong RoT and should
> > +consist of members being of low complexity and minimize the number of members
> > +participating. In a more colloquial sense, a trust chain is only as strong as its
> > +weakest link, thus more links increase the probability of a weak link.
> > +
> > +Dynamic Launch Components
> > +=========================
> > +
> > +The TCG architecture for dynamic launch is composed of a component series
> > +used to set up and then carry out the launch. These components work together to
> > +construct an RTM trust chain that is rooted in the dynamic launch and thus commonly
> > +referred to as the Dynamic Root of Trust for Measurement (DRTM) chain.
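
To make the RTM trust chain in the quoted section concrete, here is a
minimal Python sketch. It assumes a TPM-style extend operation, where the
new register value is the hash of the old value concatenated with the
digest of the measured component, and it assumes SHA-256 and a register
that starts at all zeros. The function names and the byte strings standing
in for the DCE and DLME are hypothetical illustrations, not code from the
patch series.

```python
import hashlib


def extend(register: bytes, component: bytes) -> bytes:
    """TPM-style extend: new = SHA-256(old || SHA-256(component))."""
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(register + digest).digest()


def measure_chain(components) -> bytes:
    """Fold an ordered list of components into one measurement value."""
    register = bytes(32)  # assumption: register is zeroed at the launch event
    for blob in components:
        register = extend(register, blob)
    return register


# Hypothetical stand-ins for the measured payloads.
dce = b"DCE image"
dlme = b"DLME image"

expected = measure_chain([dce, dlme])

# The same components in the same order reproduce the same value.
assert measure_chain([dce, dlme]) == expected
# Changing any member, or the order of members, changes the result.
assert measure_chain([dce, b"tampered DLME"]) != expected
assert measure_chain([dlme, dce]) != expected
```

The sketch only produces evidence; deciding whether the final value is
acceptable (the adjudication) is left to whoever compares it against a
known-good value, which matches the measurement approach described in the
quoted text.
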
> > +
> > +What follows is a brief explanation of each component in execution order. A
> > +subset of these components are what establishes the dynamic launch's trust
> > +chain.
> > +
> > +Dynamic Configuration Environment Preamble
> > +------------------------------------------
> > +
> > +The Dynamic Configuration Environment (DCE) Preamble is responsible for setting
> > +up the system environment in preparation for a dynamic launch. The DCE Preamble
> > +is not a part of the DRTM trust chain.
> > +
> > +Dynamic Launch Event
> > +--------------------
> > +
> > +The dynamic launch event is the event, typically a CPU instruction, that
> > +triggers the system's dynamic launch mechanism to begin the launch process. The
> > +dynamic launch mechanism is also the RoT for the DRTM trust chain.
> > +
> > +Dynamic Configuration Environment
> > +---------------------------------
> > +
> > +The dynamic launch mechanism may have resulted in a reset of a portion of the
> > +system. To bring the system back to an adequate state for system software, the
> > +dynamic launch will hand over control to the DCE. Prior to handing over this
> > +control, the dynamic launch will measure the DCE. Once the DCE is complete, it
> > +will proceed to measure and then execute the Dynamic Launch Measured
> > +Environment (DLME).
> > +
> > +Dynamic Launch Measured Environment
> > +-----------------------------------
> > +
> > +The DLME is the first system kernel to have control of the system, but may not
> > +be the last. Depending on the usage and configuration, the DLME may be the
> > +final/target operating system, or it may be a bootloader that will load the
> > +final/target operating system.
> > +
> > +Why DRTM
> > +========
>
> Nit: maybe
>
> Why DRTM?
> =========
>
>
> > +
> > +It is a fact that DRTM increases the load-time integrity of the system by
> > +providing a trust chain that has an immutable hardware RoT, uses a limited
> > +number of small, special purpose code to establish the trust chain that starts
> > +the target operating system. As mentioned in the Trust Chain section, these are
> > +the main three factors in driving up the strength of a trust chain. As has been
> > +seen with the BootHole exploit, which in fact did not affect the integrity of
> > +DRTM solutions, the sophistication of attacks targeting system launch is at an
> > +all-time high. There is no reason a system should not employ every available
> > +hardware integrity measure. This is the crux of a defense-in-depth
> > +approach to system security. In the past, the now closed SMI gap was often
> > +pointed to as invalidating DRTM, which in fact was nothing but a straw man
> > +argument. As has continued to be demonstrated, if/when SMM is corrupted, it can
> > +always circumvent all load-time integrity (SRTM and DRTM) because it is a
> > +run-time integrity problem. Regardless, Intel and AMD have both deployed
> > +runtime integrity for SMI and SMM which is tied directly to DRTM such that this
> > +perceived deficiency is now non-existent and the world is moving forward with
> > +an expectation that DRTM must be present.
>
> Here's my general feeling about the text up to this point. It's way too
> verbose and has bad reach, especially for non-native speakers.
>
> I don't want to nitpick every sentence that I think could be made more
> concise.
>
> What I'd suggest instead would be to go through this internally at
> Oracle with a group of people a couple of times and try to cut out
> all the extra fat.
>
> I gave those review comments in order to give an idea of what kind of
> things to look for. The benefit is that making this document more
> readable also, as a side effect, lowers the barrier to reviewing the
> patch series. Right now this is more exhausting to read than some of
> the actual science papers I've read.
>
> Hope no one takes this personally. What comes after this is a much
> better fit, but I'd still do a similar assessment.
>
> Roughly estimated, you could have a document 50% of the current length
> without loss of information content just by being considerably more
> concise. I'm worried that the series gets ignored partly because
> the documentation is already like climbing a mountain.

I want to soften this by saying that, based purely on the information
content, this is one of the best descriptions of how D-RTM works that I've
read, but that is not the same as saying that it is the best write-up.
So after a few editing rounds to tighten the text, it'll be perfect.

BR, Jarkko