Hello Sean,

On 4/26/23 8:32 AM, Reshetova, Elena wrote:
> Hi Sean,
>
> Thank you for your review! Please see my comments inline.
>
>> On Mon, Mar 27, 2023, Carlos Bilbao wrote:
>>> +Kernel developers working on confidential computing for the cloud operate
>>> +under a set of assumptions regarding the Linux kernel threat model that
>>> +differ from the traditional view. In order to effectively engage with the
>>> +linux-coco mailing list and contribute to its initiatives, one must have a
>>> +thorough familiarity with these concepts. This document provides a concise,
>>> +architecture-agnostic introduction to help developers gain a foundational
>>
>> Heh, vendor agnostic maybe, but certainly not architecture agnostic.
>
> I guess it depends where you draw a distinction between vendor and architecture.
> What was meant here is that we try to write down the overall threat model
> and high-level design that existing technologies use today.
> But I don’t mind change to vendor agnostic, if it seems more correct.
>
>>
>>> +understanding of the subject.
>>> +
>>> +Overview and terminology
>>> +========================
>>> +
>>> +Confidential Cloud Computing (CoCo) refers to a set of HW and SW
>>
>> As per Documentation/security/secrets/coco.rst and every discussion I've observed,
>> CoCo is Confidential Computing. "Cloud" is not part of the definition. That's
>> true even if this discussion is restricted to CoCo VMs, e.g. see pKVM.
>
> Yes, I personally not sure we have a single good term to describe this particular
> angle of confidential computing. Generally Confidential Computing can mean
> any CoCo technology, including things that do not relate to virtualization (like SGX).
> This document doesn’t attempt to cover all CoCo, but only a subset of them that
> relates to virtualization. Academia researches have been using term "Confidential Cloud
> Computing" (quick search on google scholar gives relevant papers), so this was
> a reason to adapt this term.
> If you have a better proposal, please tell.
>
>>
>>> +virtualization technologies that allow Cloud Service Providers (CSPs) to
>>
>> Again, CoCo isn't just for cloud use cases.
>
> See above.
>
>>
>>> +provide stronger security guarantees to their clients (usually referred to
>>> +as tenants) by excluding all the CSP's infrastructure and SW out of the
>>> +tenant's Trusted Computing Base (TCB).
>>
>> This is inaccurate, the provider may still have software and/or hardware in the
>> TCB.
>
> Well, this is the end goal where we want to be, the practical deployment can
> differ of course. We can rephrase that it "allows to exclude all the CSP's
> infrastructure and SW out of tenant's TCB."
>
>>
>> And for the cloud use case, I very, very strongly object to implying that the goal
>> of CoCo is to exclude the CSP from the TCB. Getting out of the TCB is the goal for
>> _some_ CSPs, but it is not a fundamental tenant of CoCo. This viewpoint is heavily
>> tainted by Intel's and AMD's current offerings, which effectively disallow third
>> party code for reasons that have nothing to do with security.
>>
>> https://lore.kernel.org/all/Y+aP8rHr6H3LIf%2Fc@xxxxxxxxxx
>
> I am not fully sure what you imply with this. Minimal TCB is always a good goal
> from security point of view (less hw/sw equals less bugs). From a tenant point
> of view of course it is question of risk evaluation: do they think that CSP stack
> has a higher chance to have a bug that can be exploited or SW provided by
> HW vendors? You seem to imply that some tenants might consider CSP stack to
> be more robust? If so, why would they use CoCo? In this case they are better off
> with just normal legacy VMs, no?
>
>
>>
>>> +While the concrete implementation details differ between technologies, all
>>> +of these mechanisms provide increased confidentiality and integrity of CoCo
>>> +guest memory and execution state (vCPU registers), more tightly controlled
>>> +guest interrupt injection,
>>
>> This is highly dependent on how "interrupt" is defined, and how "controlled" is
>> defined.
>
> As you know there are some limitations on what type of interrupt vectors can be
> injected into a TD guest. Vectors 0-30 are not injectable. This is what is meant by
> "more tightly controlled".
>
>>
>>> as well as some additional mechanisms to control guest-host page mapping.
>>
>> This is flat out wrong for SNP for any reasonable definition of "page mapping".
>> SNP has _zero_ "control" over page tables, which is most people think of when they
>> see "page mapping".
>
> Leaving for AMD guys to comment.

In SNP, the guest controls the association of a guest physical address to a
host physical address, so that the host can't switch that through the nested
page tables [1]. We will be more specific to avoid misinterpretation.

>
>>
>>> More details on the x86-specific solutions can be
>>> +found in
>>> +:doc:`Intel Trust Domain Extensions (TDX) </x86/tdx>` and
>>> +:doc:`AMD Memory Encryption </x86/amd-memory-encryption>`.
>>
>> So by the above definition, vanilla SEV and SEV-ES can't be considered CoCo. SEV
>> doesn't provide anything besides increased confidentiality of guest memory, and
>> SEV-ES doesn't provide integrity or validation of physical page assignment.
>>
>
> Same
>

Personally, I think it's reasonable to mention SEV/SEV-ES in the context of
confidential computing and acknowledge their relevance in this area. But there
is no mention of SEV or SEV-ES in this draft. And the document we reference
there covers AMD-SNP, which provides integrity.

>>> +The basic CoCo layout includes the host, guest, the interfaces that
>>> +communicate guest and host, a platform capable of supporting CoCo,
>>
>> CoCo VMs...
>
> Will fix.
>
>>
>>> and an intermediary between the guest virtual machine (VM) and the
>>> underlying platform that acts as security manager::
>>
>> Having an intermediary is very much an implementation detail.
>
> True, but it is kind of big component, so completely omitting it doesn’t sound
> right to me either.
>
>>
>>> +Confidential Computing threat model and security objectives
>>> +===========================================================
>>> +
>>> +Confidential Cloud Computing adds a new type of attacker to the above list:
>>> +an untrusted and potentially malicious host.
>>
>> I object to splattering "malicious host" everywhere. Many people are going to
>> read this and interpret "host" as "the CSP", and then make assumptions like
>> "CoCo assumes the CSP is malicious!". AIUI, the vast majority of use cases aren't
>> concerned so much about "the CSP" being malicious, but rather they're concerned
>> about new attack vectors that come with running code/VMs on a stack that is
>> managed by a third party, on hardware that doesn't reside in a secured facility,
>> etc.
>
> I see your point. I propose to add paragraph in the beginning that explains that
> CSPs do not intend to be malicious (at least we hope they dont), but since they
> have a big codebase to manage, bugs in that codebase are normal and CoCo
> helps to protect tenants against this situations. Also change "malicious host" to
> "unintentionally misbehaving host" or smth like this.
>
>>
>>> +While the traditional hypervisor has unlimited access to guest data and
>>> +can leverage this access to attack the guest, the CoCo systems mitigate
>>> +such attacks by adding security features like guest data confidentiality
>>> +and integrity protection. This threat model assumes that those features
>>> +are available and intact.
>>
>> Again, if you're claiming integrity is a key tenant, then SEV and SEV-ES can't be
>> considered CoCo.

Again, nobody mentioned SEV/SEV-ES here.

>>
>>> +The **Linux kernel CoCo security objectives** can be summarized as follows:
>>> +
>>> +1. Preserve the confidentiality and integrity of CoCo guest private memory.
>>
>> So, registers are fair game?
>
> No, you are right, needs to be augmented here. What we meant here is that
> the end goal of the attacker is the tenant secrets and they can also be in registers.
>
>>
>>> +2. Prevent privileged escalation from a host into a CoCo guest Linux kernel.
>>> +
>>> +The above security objectives result in two primary **Linux kernel CoCo
>>> +assets**:
>>> +
>>> +1. Guest kernel execution context.
>>> +2. Guest kernel private memory.
>>
>> ...
>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index 7f86d02cb427..4a16727bf7f9 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -5307,6 +5307,12 @@ S: Orphan
>>> W: http://accessrunner.sourceforge.net/
>>> F: drivers/usb/atm/cxacru.c
>>>
>>> +CONFIDENTIAL COMPUTING THREAT MODEL
>>
>> This is not generic CoCo documentation, it's specific to CoCo VMs. E.g. SGX is
>> most definitely considered a CoCo feature, and it has no dependencies whatsoever
>> on virtualization.
>
> Yes, so how we call it? CoCo VM is a term for a running entity.
> That's why the academic term Confidential Cloud Computing was used in the
> beginning, but you didn’t like it either.
>
>>
>>> +M: Elena Reshetova <elena.reshetova@xxxxxxxxx>
>>> +M: Carlos Bilbao <carlos.bilbao@xxxxxxx>
>>
>> I would love to see an M: or R: entry for someone that is actually _using_ CoCo.
>
> Would be more than welcomed!
>
>>
>> IMO, this document is way too Intel/AMD centric.
>
> Anyone is free to comment/participate on writing this and help us to adjust to
> even further to the rest of vendors, because for us it is hard to know details and
> applicability for other hw vendors.
> Adding Rivos guys now explicitly to CC list.
>

I'm sure we can find a common ground for this document.

>
> Best Regards,
> Elena.

Thanks,
Carlos

[1] https://www.amd.com/system/files/TechDocs/SEV-SNP-strengthening-vm-isolation-with-integrity-protection-and-more.pdf