Re: [PATCH v3 12/14] arm64: realm: Support nonsecure ITS emulation shared

Steven Price <steven.price@xxxxxxx> · Fri, 28 Jun 2024 10:59:26 +0100

On 17/06/2024 04:54, Michael Kelley wrote:
> From: Steven Price <steven.price@xxxxxxx> Sent: Wednesday, June 5, 2024 2:30 AM
>>
>> Within a realm guest the ITS is emulated by the host. This means the
>> allocations must have been made available to the host by a call to
>> set_memory_decrypted(). Introduce an allocation function which performs
>> this extra call.
>>
>> Co-developed-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
>> Signed-off-by: Steven Price <steven.price@xxxxxxx>
>> ---
>> Changes since v2:
>>  * Drop 'shared' from the new its_xxx function names as they are used
>>    for non-realm guests too.
>>  * Don't handle the NUMA_NO_NODE case specially - alloc_pages_node()
>>    should do the right thing.
>>  * Drop a pointless (void *) cast.
>> ---
>>  drivers/irqchip/irq-gic-v3-its.c | 90 ++++++++++++++++++++++++--------
>>  1 file changed, 67 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
>> index 40ebf1726393..ca72f830f4cc 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -18,6 +18,7 @@
>>  #include <linux/irqdomain.h>
>>  #include <linux/list.h>
>>  #include <linux/log2.h>
>> +#include <linux/mem_encrypt.h>
>>  #include <linux/memblock.h>
>>  #include <linux/mm.h>
>>  #include <linux/msi.h>
>> @@ -27,6 +28,7 @@
>>  #include <linux/of_pci.h>
>>  #include <linux/of_platform.h>
>>  #include <linux/percpu.h>
>> +#include <linux/set_memory.h>
>>  #include <linux/slab.h>
>>  #include <linux/syscore_ops.h>
>>
>> @@ -163,6 +165,7 @@ struct its_device {
>>  	struct its_node		*its;
>>  	struct event_lpi_map	event_map;
>>  	void			*itt;
>> +	u32			itt_order;
>>  	u32			nr_ites;
>>  	u32			device_id;
>>  	bool			shared;
>> @@ -198,6 +201,30 @@ static DEFINE_IDA(its_vpeid_ida);
>>  #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
>>  #define gic_data_rdist_vlpi_base()	(gic_data_rdist_rd_base() + SZ_128K)
>>
>> +static struct page *its_alloc_pages_node(int node, gfp_t gfp,
>> +					 unsigned int order)
>> +{
>> +	struct page *page;
>> +
>> +	page = alloc_pages_node(node, gfp, order);
>> +
>> +	if (page)
>> +		set_memory_decrypted((unsigned long)page_address(page),
>> +				     1 << order);
> 
> There's been considerable discussion on the x86 side about
> what to do when set_memory_decrypted() or set_memory_encrypted()
> fails. The conclusions are:
> 
> 1) set_memory_decrypted()/encrypted() *could* fail due to a
> compromised/malicious host, due to a bug somewhere in the
> software stack, or due to resource constraints (x86 might need to
> split a large page mapping, and need to allocate additional page
> table pages, which could fail).
> 
> 2) The guest memory that was the target of such a failed call could
> be left in an indeterminate state that the guest could not reliably
> undo or correct. The guest's view of the page's decrypted/encrypted
> state might not match the host's view. Therefore, any such guest
> memory must be leaked rather than being used or put back on the
> free list.
> 
> 3) After some discussion, we decided not to panic in such a case.
> Instead, set_memory_decrypted()/encrypted() generates a WARN,
> as well as returns a failure result. The most security conscious
> users could set panic_on_warn=1 in their VMs, and thereby stop
> further operation if there any indication that the transition between
> encrypted and decrypt is suspect. The caller of these functions
> also can take explicit action in the case of a failure.
> 
> It seems like the same guidelines should apply here. On the x86
> side we've also cleaned up cases where the return value isn't
> checked, like here and the use of set_memory_encrypted() below.

Very good points - this code was lacking error handling. I think you are
also right that the correct situation when set_memory_{en,de}crypted()
fails is to WARN() and leak the page. It's something that shouldn't
happen with a well behaving host and it's unclear how to safely recover
the page - so leaking the page is the safest result. And the WARN()
approach gives the user the option as to whether this is fatal via
panic_on_warn.

Thanks,
Steve