On Fri, Jan 28, 2022 at 11:18:02AM -0600, Brijesh Singh wrote: > SEV-SNP specification provides the guest a mechanism to communicate with ^ The > the PSP without risk from a malicious hypervisor who wishes to read, alter, > drop or replay the messages sent. The driver uses snp_issue_guest_request() > to issue GHCB SNP_GUEST_REQUEST or SNP_EXT_GUEST_REQUEST NAE events to > submit the request to PSP. > > The PSP requires that all communication should be encrypted using key > specified through the platform_data. > > The userspace can use SNP_GET_REPORT ioctl() to query the guest s/The u/U/ > attestation report. > > See SEV-SNP spec section Guest Messages for more details. > > Signed-off-by: Brijesh Singh <brijesh.singh@xxxxxxx> > --- > Documentation/virt/coco/sevguest.rst | 81 ++++ > drivers/virt/Kconfig | 3 + > drivers/virt/Makefile | 1 + > drivers/virt/coco/sevguest/Kconfig | 12 + > drivers/virt/coco/sevguest/Makefile | 2 + > drivers/virt/coco/sevguest/sevguest.c | 605 ++++++++++++++++++++++++++ > drivers/virt/coco/sevguest/sevguest.h | 98 +++++ > include/uapi/linux/sev-guest.h | 50 +++ > 8 files changed, 852 insertions(+) > create mode 100644 Documentation/virt/coco/sevguest.rst > create mode 100644 drivers/virt/coco/sevguest/Kconfig > create mode 100644 drivers/virt/coco/sevguest/Makefile > create mode 100644 drivers/virt/coco/sevguest/sevguest.c > create mode 100644 drivers/virt/coco/sevguest/sevguest.h > create mode 100644 include/uapi/linux/sev-guest.h > > diff --git a/Documentation/virt/coco/sevguest.rst b/Documentation/virt/coco/sevguest.rst > new file mode 100644 > index 000000000000..47ef3b0821d5 > --- /dev/null > +++ b/Documentation/virt/coco/sevguest.rst Adding documentation is all fine and good but you need to link that file into the TOC somewhere so that it is visible when the documentation gets generated... I guess something like this: diff --git a/Documentation/virt/index.rst b/Documentation/virt/index.rst index edea7fea95a8..40ad0d20032e 100644 --- a/Documentation/virt/index.rst +++ b/Documentation/virt/index.rst @@ -13,6 +13,7 @@ Linux Virtualization Support guest-halt-polling ne_overview acrn/index + coco/sevguest .. only:: html and subproject And once you do, do "make htmldocs" to see whether it complains about some formatting issues or other errors etc. /me goes and does it... Ah, here we go: Documentation/virt/coco/sevguest.rst:48: WARNING: Inline emphasis start-string without end-string. Documentation/virt/coco/sevguest.rst:51: WARNING: Inline emphasis start-string without end-string. Documentation/virt/coco/sevguest.rst:55: WARNING: Inline emphasis start-string without end-string. Documentation/virt/coco/sevguest.rst:57: WARNING: Definition list ends without a blank line; unexpected unindent. There's something it doesn't like about the struct. Yeah, when I look at the html output, it is all weird and not monospaced... > @@ -0,0 +1,81 @@ > +.. SPDX-License-Identifier: GPL-2.0 > + > +=================================================================== > +The Definitive SEV Guest API Documentation > +=================================================================== > + > +1. General description > +====================== > + > +The SEV API is a set of ioctls that are used by the guest or hypervisor > +to get or set certain aspect of the SEV virtual machine. The ioctls belong ^ a > +to the following classes: > + > + - Hypervisor ioctls: These query and set global attributes which affect the > + whole SEV firmware. These ioctl are used by platform provision tools. "provisioning" maybe? > + > + - Guest ioctls: These query and set attributes of the SEV virtual machine. > + > +2. API description > +================== > + > +This section describes ioctls that can be used to query or set SEV guests. ^^^^^^^^^^^^^^^ That sounds weird. > +For each ioctl, the following information is provided along with a > +description: > + > + Technology: > + which SEV technology provides this ioctl. sev, sev-es, sev-snp or all. ^^^^^^^^^^^^^^^^^^^^ Marketing is going to scold you - those need to be in all caps. :-P > + > + Type: > + hypervisor or guest. The ioctl can be used inside the guest or the > + hypervisor. > + > + Parameters: > + what parameters are accepted by the ioctl. > + > + Returns: > + the return value. General error numbers (ENOMEM, EINVAL) Those are negative: -ENOMEM, -EINVAL > + are not detailed, but errors with specific meanings are. > + > +The guest ioctl should be issued on a file descriptor of the /dev/sev-guest device. > +The ioctl accepts struct snp_user_guest_request. The input and output structure is > +specified through the req_data and resp_data field respectively. If the ioctl fails > +to execute due to a firmware error, then fw_err code will be set otherwise the > +fw_err will be set to 0xff. fw_err is u64. What does 0xff mean? Everything above the least significant byte is reserved 0? > diff --git a/drivers/virt/coco/sevguest/Kconfig b/drivers/virt/coco/sevguest/Kconfig > new file mode 100644 > index 000000000000..07ab9ec6471c > --- /dev/null > +++ b/drivers/virt/coco/sevguest/Kconfig > @@ -0,0 +1,12 @@ > +config SEV_GUEST > + tristate "AMD SEV Guest driver" > + default y Definitely not. We don't enable drivers by default unless they're ubiquitous. > + depends on AMD_MEM_ENCRYPT && CRYPTO_AEAD2 > + help > + SEV-SNP firmware provides the guest a mechanism to communicate with > + the PSP without risk from a malicious hypervisor who wishes to read, > + alter, drop or replay the messages sent. The driver provides > + userspace interface to communicate with the PSP to request the > + attestation report and more. > + > + If you choose 'M' here, this module will be called sevguest. ... > +static bool is_vmpck_empty(struct snp_guest_dev *snp_dev) > +{ > + char zero_key[VMPCK_KEY_LEN] = {0}; > + > + if (snp_dev->vmpck) > + return memcmp(snp_dev->vmpck, zero_key, VMPCK_KEY_LEN) == 0; return !memcmp(...); > + > + return true; > +} > + > +static void snp_disable_vmpck(struct snp_guest_dev *snp_dev) > +{ > + memzero_explicit(snp_dev->vmpck, VMPCK_KEY_LEN); > + snp_dev->vmpck = NULL; > +} > + > +static inline u64 __snp_get_msg_seqno(struct snp_guest_dev *snp_dev) > +{ > + u64 count; > + > + lockdep_assert_held(&snp_cmd_mutex); > + > + /* Read the current message sequence counter from secrets pages */ > + count = *snp_dev->os_area_msg_seqno; > + > + return count + 1; > +} > + > +/* Return a non-zero on success */ > +static u64 snp_get_msg_seqno(struct snp_guest_dev *snp_dev) > +{ > + u64 count = __snp_get_msg_seqno(snp_dev); > + > + /* > + * The message sequence counter for the SNP guest request is a 64-bit > + * value but the version 2 of GHCB specification defines a 32-bit storage > + * for it. If the counter exceeds the 32-bit value then return zero. > + * The caller should check the return value, but if the caller happens to > + * not check the value and use it, then the firmware treats zero as an > + * invalid number and will fail the message request. > + */ > + if (count >= UINT_MAX) { > + pr_err_ratelimited("SNP guest request message sequence counter overflow\n"); How does error message help the user? Is she supposed to reboot the machine or so? Because it sounds to me like if this goes over 32-bit, this driver stops working. So what resets those sequence numbers? > + return 0; > + } > + > + return count; > +} > + > +static void snp_inc_msg_seqno(struct snp_guest_dev *snp_dev) > +{ > + /* > + * The counter is also incremented by the PSP, so increment it by 2 > + * and save in secrets page. > + */ > + *snp_dev->os_area_msg_seqno += 2; > +} > + > +static inline struct snp_guest_dev *to_snp_dev(struct file *file) > +{ > + struct miscdevice *dev = file->private_data; > + > + return container_of(dev, struct snp_guest_dev, misc); > +} > + > +static struct snp_guest_crypto *init_crypto(struct snp_guest_dev *snp_dev, u8 *key, size_t keylen) > +{ > + struct snp_guest_crypto *crypto; > + > + crypto = kzalloc(sizeof(*crypto), GFP_KERNEL_ACCOUNT); > + if (!crypto) > + return NULL; > + > + crypto->tfm = crypto_alloc_aead("gcm(aes)", 0, 0); > + if (IS_ERR(crypto->tfm)) > + goto e_free; > + > + if (crypto_aead_setkey(crypto->tfm, key, keylen)) > + goto e_free_crypto; > + > + crypto->iv_len = crypto_aead_ivsize(crypto->tfm); > + if (crypto->iv_len < 12) { > + dev_err(snp_dev->dev, "IV length is less than 12.\n"); And? < 12 is bad? Make that error message more user-friendly pls. > + goto e_free_crypto; > + } > + > + crypto->iv = kmalloc(crypto->iv_len, GFP_KERNEL_ACCOUNT); > + if (!crypto->iv) > + goto e_free_crypto; > + > + if (crypto_aead_authsize(crypto->tfm) > MAX_AUTHTAG_LEN) { > + if (crypto_aead_setauthsize(crypto->tfm, MAX_AUTHTAG_LEN)) { > + dev_err(snp_dev->dev, "failed to set authsize to %d\n", MAX_AUTHTAG_LEN); > + goto e_free_crypto; > + } > + } > + > + crypto->a_len = crypto_aead_authsize(crypto->tfm); > + crypto->authtag = kmalloc(crypto->a_len, GFP_KERNEL_ACCOUNT); > + if (!crypto->authtag) > + goto e_free_crypto; > + > + return crypto; > + > +e_free_crypto: > + crypto_free_aead(crypto->tfm); > +e_free: > + kfree(crypto->iv); > + kfree(crypto->authtag); > + kfree(crypto); The order of those free calls needs to be the opposite of the kmallocs above. > + > + return NULL; > +} ... > +static int handle_guest_request(struct snp_guest_dev *snp_dev, u64 exit_code, int msg_ver, > + u8 type, void *req_buf, size_t req_sz, void *resp_buf, > + u32 resp_sz, __u64 *fw_err) > +{ > + unsigned long err; > + u64 seqno; > + int rc; > + > + /* Get message sequence and verify that its a non-zero */ > + seqno = snp_get_msg_seqno(snp_dev); > + if (!seqno) > + return -EIO; > + > + memset(snp_dev->response, 0, sizeof(*snp_dev->response)); sizeof(struct snp_guest_msg) > + > + /* Encrypt the userspace provided payload */ > + rc = enc_payload(snp_dev, seqno, msg_ver, type, req_buf, req_sz); > + if (rc) > + return rc; > + > + /* Call firmware to process the request */ > + rc = snp_issue_guest_request(exit_code, &snp_dev->input, &err); > + if (fw_err) > + *fw_err = err; > + > + if (rc) > + return rc; > + > + rc = verify_and_dec_payload(snp_dev, resp_buf, resp_sz); > + if (rc) { > + /* > + * The verify_and_dec_payload() will fail only if the hypervisor is > + * actively modifying the message header or corrupting the encrypted payload. > + * This hints that hypervisor is acting in a bad faith. Disable the VMPCK so that > + * the key cannot be used for any communication. The key is disabled to ensure > + * that AES-GCM does not use the same IV while encrypting the request payload. > + */ Put that comment over the function call. > + dev_alert(snp_dev->dev, > + "Detected unexpected decode failure, disabling the vmpck_id %d\n", > + vmpck_id); > + snp_disable_vmpck(snp_dev); > + return rc; > + } > + > + /* Increment to new message sequence after payload decryption was successful. */ > + snp_inc_msg_seqno(snp_dev); > + > + return 0; > +} > + > +static int get_report(struct snp_guest_dev *snp_dev, struct snp_guest_request_ioctl *arg) > +{ > + struct snp_guest_crypto *crypto = snp_dev->crypto; > + struct snp_report_req req = {0}; > + struct snp_report_resp *resp; > + int rc, resp_len; lockdep_assert_held(&snp_cmd_mutex); > + if (!arg->req_data || !arg->resp_data) > + return -EINVAL; > + > + /* Copy the request payload from userspace */ > + if (copy_from_user(&req, (void __user *)arg->req_data, sizeof(req))) > + return -EFAULT; > + > + /* > + * The intermediate response buffer is used while decrypting the > + * response payload. Make sure that it has enough space to cover the > + * authtag. > + */ > + resp_len = sizeof(resp->data) + crypto->a_len; > + resp = kzalloc(resp_len, GFP_KERNEL_ACCOUNT); > + if (!resp) > + return -ENOMEM; > + > + /* Issue the command to get the attestation report */ > + rc = handle_guest_request(snp_dev, SVM_VMGEXIT_GUEST_REQUEST, arg->msg_version, > + SNP_MSG_REPORT_REQ, &req, sizeof(req), resp->data, > + resp_len, &arg->fw_err); > + if (rc) > + goto e_free; > + > + /* Copy the response payload to userspace */ > + if (copy_to_user((void __user *)arg->resp_data, resp, sizeof(*resp))) > + rc = -EFAULT; > + > +e_free: > + kfree(resp); > + return rc; > +} > + > +static long snp_guest_ioctl(struct file *file, unsigned int ioctl, unsigned long arg) > +{ > + struct snp_guest_dev *snp_dev = to_snp_dev(file); > + void __user *argp = (void __user *)arg; > + struct snp_guest_request_ioctl input; > + int ret = -ENOTTY; > + > + if (copy_from_user(&input, argp, sizeof(input))) > + return -EFAULT; > + > + input.fw_err = 0xff; > + > + /* Message version must be non-zero */ > + if (!input.msg_version) > + return -EINVAL; > + > + mutex_lock(&snp_cmd_mutex); That mutex probably is to be held while issuing SNP commands but then you hold it here already for... > + > + /* Check if the VMPCK is not empty */ > + if (is_vmpck_empty(snp_dev)) { ... this here which is not really a SNP command issuing. Should that mutex be grabbed only around handle_guest_request() above or is it supposed to protect more stuff? > + dev_err_ratelimited(snp_dev->dev, "VMPCK is disabled\n"); > + mutex_unlock(&snp_cmd_mutex); > + return -ENOTTY; > + } > + > + switch (ioctl) { > + case SNP_GET_REPORT: > + ret = get_report(snp_dev, &input); > + break; > + default: > + break; > + } > + > + mutex_unlock(&snp_cmd_mutex); > + > + if (input.fw_err && copy_to_user(argp, &input, sizeof(input))) > + return -EFAULT; > + > + return ret; > +} > + > +static void free_shared_pages(void *buf, size_t sz) > +{ > + unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT; > + > + if (!buf) > + return; > + > + /* If fail to restore the encryption mask then leak it. */ Useless comment - the error message says the same. > + if (WARN_ONCE(set_memory_encrypted((unsigned long)buf, npages), > + "Failed to restore encryption mask (leak it)\n")) > + return; > + > + __free_pages(virt_to_page(buf), get_order(sz)); > +} > + > +static void *alloc_shared_pages(size_t sz) > +{ > + unsigned int npages = PAGE_ALIGN(sz) >> PAGE_SHIFT; > + struct page *page; > + int ret; > + > + page = alloc_pages(GFP_KERNEL_ACCOUNT, get_order(sz)); > + if (IS_ERR(page)) > + return NULL; > + > + ret = set_memory_decrypted((unsigned long)page_address(page), npages); > + if (ret) { > + pr_err("SEV-SNP: failed to mark page shared, ret=%d\n", ret); Use pr_fmt, grep the tree for an example how and drop "SEV-SNP:" - that prefix should be "sevguest:". > + __free_pages(page, get_order(sz)); > + return NULL; > + } > + > + return page_address(page); > +} ... > +static int __init snp_guest_probe(struct platform_device *pdev) > +{ > + struct snp_secrets_page_layout *layout; > + struct snp_guest_platform_data *data; > + struct device *dev = &pdev->dev; > + struct snp_guest_dev *snp_dev; > + struct miscdevice *misc; > + int ret; > + > + if (!dev->platform_data) > + return -ENODEV; > + > + data = (struct snp_guest_platform_data *)dev->platform_data; > + layout = (__force void *)ioremap_encrypted(data->secrets_gpa, PAGE_SIZE); > + if (!layout) > + return -ENODEV; > + > + ret = -ENOMEM; > + snp_dev = devm_kzalloc(&pdev->dev, sizeof(struct snp_guest_dev), GFP_KERNEL); > + if (!snp_dev) > + goto e_fail; > + > + ret = -EINVAL; > + snp_dev->vmpck = get_vmpck(vmpck_id, layout, &snp_dev->os_area_msg_seqno); > + if (!snp_dev->vmpck) { > + dev_err(dev, "invalid vmpck id %d\n", vmpck_id); > + goto e_fail; > + } > + > + /* Verify that VMPCK is not zero. */ > + if (is_vmpck_empty(snp_dev)) { > + dev_err(dev, "vmpck id %d is null\n", vmpck_id); > + goto e_fail; > + } > + > + platform_set_drvdata(pdev, snp_dev); > + snp_dev->dev = dev; > + snp_dev->layout = layout; > + > + /* Allocate the shared page used for the request and response message. */ > + snp_dev->request = alloc_shared_pages(sizeof(struct snp_guest_msg)); > + if (!snp_dev->request) > + goto e_fail; > + > + snp_dev->response = alloc_shared_pages(sizeof(struct snp_guest_msg)); > + if (!snp_dev->response) > + goto e_fail; > + > + ret = -EIO; > + snp_dev->crypto = init_crypto(snp_dev, snp_dev->vmpck, VMPCK_KEY_LEN); > + if (!snp_dev->crypto) > + goto e_fail; > + > + misc = &snp_dev->misc; > + misc->minor = MISC_DYNAMIC_MINOR; > + misc->name = DEVICE_NAME; > + misc->fops = &snp_guest_fops; > + > + /* initial the input address for guest request */ > + snp_dev->input.req_gpa = __pa(snp_dev->request); > + snp_dev->input.resp_gpa = __pa(snp_dev->response); > + > + ret = misc_register(misc); > + if (ret) > + goto e_fail; > + > + dev_info(dev, "Initialized SEV-SNP guest driver (using vmpck_id %d)\n", vmpck_id); > + return 0; > + > +e_fail: > + iounmap(layout); > + free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg)); > + free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg)); The freeing needs to happen in the opposite order of the allocations. Audit all gotos. > + > + return ret; > +} > + > +static int __exit snp_guest_remove(struct platform_device *pdev) > +{ > + struct snp_guest_dev *snp_dev = platform_get_drvdata(pdev); > + > + free_shared_pages(snp_dev->request, sizeof(struct snp_guest_msg)); > + free_shared_pages(snp_dev->response, sizeof(struct snp_guest_msg)); > + deinit_crypto(snp_dev->crypto); > + misc_deregister(&snp_dev->misc); > + > + return 0; > +} > + > +static struct platform_driver snp_guest_driver = { > + .remove = __exit_p(snp_guest_remove), > + .driver = { > + .name = "snp-guest", > + }, > +}; > + > +module_platform_driver_probe(snp_guest_driver, snp_guest_probe); > + > +MODULE_AUTHOR("Brijesh Singh <brijesh.singh@xxxxxxx>"); > +MODULE_LICENSE("GPL"); > +MODULE_VERSION("1.0.0"); > +MODULE_DESCRIPTION("AMD SNP Guest Driver"); > diff --git a/drivers/virt/coco/sevguest/sevguest.h b/drivers/virt/coco/sevguest/sevguest.h > new file mode 100644 > index 000000000000..cfa76cf8a21a > --- /dev/null > +++ b/drivers/virt/coco/sevguest/sevguest.h > @@ -0,0 +1,98 @@ > +/* SPDX-License-Identifier: GPL-2.0-only */ > +/* > + * Copyright (C) 2021 Advanced Micro Devices, Inc. > + * > + * Author: Brijesh Singh <brijesh.singh@xxxxxxx> > + * > + * SEV-SNP API spec is available at https://developer.amd.com/sev > + */ > + > +#ifndef __LINUX_SEVGUEST_H_ I guess VIRT_SEVGUEST_H is better fitting. > +#define __LINUX_SEVGUEST_H_ ... -- Regards/Gruss, Boris. https://people.kernel.org/tglx/notes-about-netiquette