On 17/06/2024 04:54, Michael Kelley wrote: > From: Steven Price <steven.price@xxxxxxx> Sent: Wednesday, June 5, 2024 2:30 AM >> >> Within a realm guest the ITS is emulated by the host. This means the >> allocations must have been made available to the host by a call to >> set_memory_decrypted(). Introduce an allocation function which performs >> this extra call. >> >> Co-developed-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx> >> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx> >> Signed-off-by: Steven Price <steven.price@xxxxxxx> >> --- >> Changes since v2: >> * Drop 'shared' from the new its_xxx function names as they are used >> for non-realm guests too. >> * Don't handle the NUMA_NO_NODE case specially - alloc_pages_node() >> should do the right thing. >> * Drop a pointless (void *) cast. >> --- >> drivers/irqchip/irq-gic-v3-its.c | 90 ++++++++++++++++++++++++-------- >> 1 file changed, 67 insertions(+), 23 deletions(-) >> >> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c >> index 40ebf1726393..ca72f830f4cc 100644 >> --- a/drivers/irqchip/irq-gic-v3-its.c >> +++ b/drivers/irqchip/irq-gic-v3-its.c >> @@ -18,6 +18,7 @@ >> #include <linux/irqdomain.h> >> #include <linux/list.h> >> #include <linux/log2.h> >> +#include <linux/mem_encrypt.h> >> #include <linux/memblock.h> >> #include <linux/mm.h> >> #include <linux/msi.h> >> @@ -27,6 +28,7 @@ >> #include <linux/of_pci.h> >> #include <linux/of_platform.h> >> #include <linux/percpu.h> >> +#include <linux/set_memory.h> >> #include <linux/slab.h> >> #include <linux/syscore_ops.h> >> >> @@ -163,6 +165,7 @@ struct its_device { >> struct its_node *its; >> struct event_lpi_map event_map; >> void *itt; >> + u32 itt_order; >> u32 nr_ites; >> u32 device_id; >> bool shared; >> @@ -198,6 +201,30 @@ static DEFINE_IDA(its_vpeid_ida); >> #define gic_data_rdist_rd_base() (gic_data_rdist()->rd_base) >> #define gic_data_rdist_vlpi_base() (gic_data_rdist_rd_base() + SZ_128K) >> >> +static struct page *its_alloc_pages_node(int node, gfp_t gfp, >> + unsigned int order) >> +{ >> + struct page *page; >> + >> + page = alloc_pages_node(node, gfp, order); >> + >> + if (page) >> + set_memory_decrypted((unsigned long)page_address(page), >> + 1 << order); > > There's been considerable discussion on the x86 side about > what to do when set_memory_decrypted() or set_memory_encrypted() > fails. The conclusions are: > > 1) set_memory_decrypted()/encrypted() *could* fail due to a > compromised/malicious host, due to a bug somewhere in the > software stack, or due to resource constraints (x86 might need to > split a large page mapping, and need to allocate additional page > table pages, which could fail). > > 2) The guest memory that was the target of such a failed call could > be left in an indeterminate state that the guest could not reliably > undo or correct. The guest's view of the page's decrypted/encrypted > state might not match the host's view. Therefore, any such guest > memory must be leaked rather than being used or put back on the > free list. > > 3) After some discussion, we decided not to panic in such a case. > Instead, set_memory_decrypted()/encrypted() generates a WARN, > as well as returns a failure result. The most security conscious > users could set panic_on_warn=1 in their VMs, and thereby stop > further operation if there any indication that the transition between > encrypted and decrypt is suspect. The caller of these functions > also can take explicit action in the case of a failure. > > It seems like the same guidelines should apply here. On the x86 > side we've also cleaned up cases where the return value isn't > checked, like here and the use of set_memory_encrypted() below. Very good points - this code was lacking error handling. I think you are also right that the correct situation when set_memory_{en,de}crypted() fails is to WARN() and leak the page. It's something that shouldn't happen with a well behaving host and it's unclear how to safely recover the page - so leaking the page is the safest result. And the WARN() approach gives the user the option as to whether this is fatal via panic_on_warn. Thanks, Steve