Re: [PATCH v3 12/14] arm64: realm: Support nonsecure ITS emulation shared

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 05 Jun 2024 16:08:49 +0100,
Steven Price <steven.price@xxxxxxx> wrote:
> 
> Hi Marc,
> 
> On 05/06/2024 14:39, Marc Zyngier wrote:
> > The subject line is... odd. I'd expect something like:
> > 
> > "irqchip/gic-v3-its: Share ITS tables with a non-trusted hypervisor"
> > 
> > because nothing here should be CCA specific.
> 
> Good point - that's a much better subject.
> 
> > On Wed, 05 Jun 2024 10:30:04 +0100,
> > Steven Price <steven.price@xxxxxxx> wrote:
> >>
> >> Within a realm guest the ITS is emulated by the host. This means the
> >> allocations must have been made available to the host by a call to
> >> set_memory_decrypted(). Introduce an allocation function which performs
> >> this extra call.
> > 
> > This doesn't mention that this patch radically changes the allocation
> > of some tables.
> 
> I guess that depends on your definition of radical, see below.

It's election time, I'm all about making bold statements!

[...]

> >> @@ -3334,8 +3365,9 @@ static bool its_alloc_table_entry(struct its_node *its,
> >>  
> >>  	/* Allocate memory for 2nd level table */
> >>  	if (!table[idx]) {
> >> -		page = alloc_pages_node(its->numa_node, GFP_KERNEL | __GFP_ZERO,
> >> -					get_order(baser->psz));
> >> +		page = its_alloc_pages_node(its->numa_node,
> >> +					    GFP_KERNEL | __GFP_ZERO,
> >> +					    get_order(baser->psz));
> >>  		if (!page)
> >>  			return false;
> >>  
> >> @@ -3418,7 +3450,9 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
> >>  	unsigned long *lpi_map = NULL;
> >>  	unsigned long flags;
> >>  	u16 *col_map = NULL;
> >> +	struct page *page;
> >>  	void *itt;
> >> +	int itt_order;
> >>  	int lpi_base;
> >>  	int nr_lpis;
> >>  	int nr_ites;
> >> @@ -3430,7 +3464,6 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
> >>  	if (WARN_ON(!is_power_of_2(nvecs)))
> >>  		nvecs = roundup_pow_of_two(nvecs);
> >>  
> >> -	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
> >>  	/*
> >>  	 * Even if the device wants a single LPI, the ITT must be
> >>  	 * sized as a power of two (and you need at least one bit...).
> >> @@ -3438,7 +3471,16 @@ static struct its_device *its_create_device(struct its_node *its, u32 dev_id,
> >>  	nr_ites = max(2, nvecs);
> >>  	sz = nr_ites * (FIELD_GET(GITS_TYPER_ITT_ENTRY_SIZE, its->typer) + 1);
> >>  	sz = max(sz, ITS_ITT_ALIGN) + ITS_ITT_ALIGN - 1;
> >> -	itt = kzalloc_node(sz, GFP_KERNEL, its->numa_node);
> >> +	itt_order = get_order(sz);
> >> +	page = its_alloc_pages_node(its->numa_node,
> >> +				    GFP_KERNEL | __GFP_ZERO,
> >> +				    itt_order);
> > 
> > So we go from an allocation that was so far measured in *bytes* to
> > something that is now at least a page. Per device. This seems a bit
> > excessive to me, specially when it isn't conditioned on anything and
> > is now imposed on all platforms, including the non-CCA systems (which
> > are exactly 100% of the machines).
> 
> Catalin asked about this in v2:
> https://lore.kernel.org/lkml/c329ae18-2b61-4851-8d6a-9e691a2007c8@xxxxxxx/
> 
> To be honest, I don't have a great handle on how much memory is being
> wasted here. Within the realm guest I was testing this is rounding up an
> otherwise 511 byte allocation to a 4k page, and there are 3 of them.
> Which seems reasonable from a realm guest perspective.

And not that reasonable on a smaller system, such as my own router VM
that has a whole lot of devices and very little memory. Not to mention
that while CCA is stuck with 4k pages (duh!), the world is moving
towards larger pages, meaning that this is wasting even more memory.

> 
> I can see two options to improve here:
> 
> 1. Add a !is_realm_world() check and return to the previous behaviour
> when not running in a realm. It's ugly, and doesn't deal with any other
> potential future memory encryption. cc_platform_has(CC_ATTR_MEM_ENCRYPT)
> might be preferable? But this means no impact to non-realm guests.

No, this is way too ugly, and doesn't help with things like pKVM.

> 
> 2. Use a special (global) memory allocator that does the
> set_memory_decrypted() dance on the pages that it allocates but allows
> packing the allocations. I'm not aware of an existing kernel API for
> this, so it's potentially quite a bit of code. The benefit is that it
> reduces memory consumption in a realm guest, although fragmentation
> still means we're likely to see a (small) growth.
> 
> Any thoughts on what you think would be best?

I would expect that something similar to kmem_cache could be of help,
only with the ability to deal with variable object sizes (in this
case: minimum of 256 bytes, in increments defined by the
implementation, and with a 256 byte alignment).

I don't think the ITS is particularly special here, and we should come
up with something that is generic enough to support sharing of
non-page-sized objects.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.




[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux