On 7/26/2022 8:02 PM, Chao Peng wrote:
> On Mon, Jul 25, 2022 at 07:16:24PM +0530, Nikunj A. Dadhania wrote:
>> On 7/20/2022 8:29 PM, Chao Peng wrote:
>>> On Thu, Jul 14, 2022 at 01:03:46AM +0000, Sean Christopherson wrote:
>>> ...
>>>>
>>>> Option D). track shared regions in an Xarray, update kvm_arch_memory_slot.lpage_info
>>>> on insertion/removal to (dis)allow hugepages as needed.
>>>>
>>>>   + efficient on KVM page fault (no new lookups)
>>>>   + zero memory overhead (assuming KVM has to eat the cost of the Xarray anyways)
>>>>   + straightforward to implement
>>>>   + can (and should) be merged as part of the UPM series
>>>>
>>>> I believe xa_for_each_range() can be used to see if a given 2mb/1gb range is
>>>> completely covered (fully shared) or not covered at all (fully private), but I'm
>>>> not 100% certain that xa_for_each_range() works the way I think it does.
>>>
>>> Hi Sean,
>>>
>>> Below is the implementation to support 2M as you mentioned as option D.
>>> It's based on the UPM v7 xarray code: https://lkml.org/lkml/2022/7/6/259
>>>
>>> Everything sounds good; the only tricky bit is the inc/dec of
>>> disallow_lpage. If we still treat it as a count, it is a challenge to
>>> keep the inc/dec balanced, so in this patch I stole a bit for the
>>> purpose, which looks ugly.
>>>
>>> Any feedback is welcome.
>>>
>>> Thanks,
>>> Chao
>>>
>>> -----------------------------------------------------------------------
>>> From: Chao Peng <chao.p.peng@xxxxxxxxxxxxxxx>
>>> Date: Wed, 20 Jul 2022 11:37:18 +0800
>>> Subject: [PATCH] KVM: Add large page support for private memory
>>>
>>> Update lpage_info when handling KVM_MEMORY_ENCRYPT_{UN,}REG_REGION.
>>>
>>> Reserve a bit in disallow_lpage to indicate a large page has
>>> private/shared pages mixed.
>>>
>>> Signed-off-by: Chao Peng <chao.p.peng@xxxxxxxxxxxxxxx>
>>> ---
>>
>>
>>> +static void update_mem_lpage_info(struct kvm *kvm,
>>> +				  struct kvm_memory_slot *slot,
>>> +				  unsigned int attr,
>>> +				  gfn_t start, gfn_t end)
>>> +{
>>> +	unsigned long lpage_start, lpage_end;
>>> +	unsigned long gfn, pages, mask;
>>> +	int level;
>>> +
>>> +	for (level = PG_LEVEL_2M; level <= KVM_MAX_HUGEPAGE_LEVEL; level++) {
>>> +		pages = KVM_PAGES_PER_HPAGE(level);
>>> +		mask = ~(pages - 1);
>>> +		lpage_start = start & mask;
>>> +		lpage_end = end & mask;
>>> +
>>> +		/*
>>> +		 * We only need to scan the head and tail pages; for the
>>> +		 * middle pages we know they are not mixed.
>>> +		 */
>>> +		update_mixed(lpage_info_slot(lpage_start, slot, level),
>>> +			     mem_attr_is_mixed(kvm, attr, lpage_start,
>>> +					       lpage_start + pages));
>>> +
>>> +		if (lpage_start == lpage_end)
>>> +			return;
>>> +
>>> +		for (gfn = lpage_start + pages; gfn < lpage_end; gfn += pages) {
>>> +			update_mixed(lpage_info_slot(gfn, slot, level), false);
>>> +		}
>>
>> A boundary check is missing here for the case when gfn reaches lpage_end:
>>
>> 	if (gfn == lpage_end)
>> 		return;
>
> In this case, it's actually the tail page that I want to scan for with
> the code below.

What if you do not have the tail lpage? For example:

memslot base_gfn = 0x1000 and npages = 0x800, so the memslot covers
0x1000 to 0x17ff. Assume this function is called with start = 0x1000
and end = 0x1800.

For 2M, pages = 0x200 and mask = ~0x1ff; start and end are both 2M
aligned, so lpage_start = 0x1000 and lpage_end = 0x1800.

The first update_mixed() takes care of 0x1000-0x1200, and the
update_mixed() loop covers 0x1200-0x1800, so there is no page left for
the last update_mixed() to process: it is called for 0x1800, which is
already past the end of the memslot.
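To make that concrete, here is a tiny userspace sketch of the same
arithmetic (not KVM code; the constants just mirror the example above):

	/* Standalone demo of the head/middle/tail split; constants mirror
	 * the example (this is not KVM code, just the same arithmetic). */
	#include <stdio.h>

	int main(void)
	{
		unsigned long base_gfn = 0x1000, npages = 0x800; /* slot: 0x1000-0x17ff */
		unsigned long start = 0x1000, end = 0x1800;	 /* converted range */
		unsigned long pages = 0x200;			 /* 2M worth of 4K gfns */
		unsigned long mask = ~(pages - 1);
		unsigned long lpage_start = start & mask;	 /* 0x1000 */
		unsigned long lpage_end = end & mask;		 /* 0x1800 */
		unsigned long gfn;

		printf("head  : 0x%lx-0x%lx\n", lpage_start, lpage_start + pages);

		for (gfn = lpage_start + pages; gfn < lpage_end; gfn += pages)
			printf("middle: 0x%lx-0x%lx\n", gfn, gfn + pages);

		/* The unconditional tail update then touches... */
		printf("tail  : 0x%lx-0x%lx\n", lpage_end, lpage_end + pages);

		if (lpage_end >= base_gfn + npages)
			printf("tail gfn 0x%lx is outside the memslot -> "
			       "lpage_info_slot() would go out of bounds\n",
			       lpage_end);
		return 0;
	}

Running it shows the middle loop already covers everything up to
0x1800, and the tail update lands on 0x1800-0x1a00, outside the slot.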
>
> It's also possible I misunderstand something here.
>
> Chao

>>
>>> +
>>> +		update_mixed(lpage_info_slot(lpage_end, slot, level),
>>> +			     mem_attr_is_mixed(kvm, attr, lpage_end,
>>> +					       lpage_end + pages));

This lpage_info_slot() call sometimes causes a crash, as I noticed that
lpage_info_slot() returns an out-of-bounds index in cases like the one
above, where lpage_end falls outside the memslot.

Regards
Nikunj
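P.S.: a sketch of where such a boundary check could sit (untested, and
only my reading of the intended semantics, not the posted patch): skip
the tail scan whenever end is already aligned to this level, since the
middle loop then covers everything and the tail page lies wholly outside
[start, end). Note it uses continue rather than return, so the remaining
large-page levels are still visited:

	for (gfn = lpage_start + pages; gfn < lpage_end; gfn += pages)
		update_mixed(lpage_info_slot(gfn, slot, level), false);

	/*
	 * No partial tail page when 'end' is aligned to this level: the
	 * loop above already covered everything up to lpage_end, and
	 * touching lpage_end here could run past the memslot (and the
	 * lpage_info array), as in the example above.
	 */
	if (lpage_end == end)
		continue;

	update_mixed(lpage_info_slot(lpage_end, slot, level),
		     mem_attr_is_mixed(kvm, attr, lpage_end,
				       lpage_end + pages));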