On 30/01/2025 05:22, Gavin Shan wrote: > On 12/13/24 1:55 AM, Steven Price wrote: [...] >> diff --git a/arch/arm64/kvm/rme.c b/arch/arm64/kvm/rme.c >> index d4561e368cd5..146ef598a581 100644 >> --- a/arch/arm64/kvm/rme.c >> +++ b/arch/arm64/kvm/rme.c >> @@ -602,6 +602,162 @@ static int fold_rtt(struct realm *realm, >> unsigned long addr, int level) >> return 0; >> } >> +int realm_map_protected(struct realm *realm, >> + unsigned long ipa, >> + kvm_pfn_t pfn, >> + unsigned long map_size, >> + struct kvm_mmu_memory_cache *memcache) >> +{ >> + phys_addr_t phys = __pfn_to_phys(pfn); >> + phys_addr_t rd = virt_to_phys(realm->rd); >> + unsigned long base_ipa = ipa; >> + unsigned long size; >> + int map_level; >> + int ret = 0; >> + >> + if (WARN_ON(!IS_ALIGNED(ipa, map_size))) >> + return -EINVAL; >> + >> + switch (map_size) { >> + case PAGE_SIZE: >> + map_level = 3; >> + break; >> + case RMM_L2_BLOCK_SIZE: >> + map_level = 2; >> + break; >> + default: >> + return -EINVAL; >> + } >> + > > The same block of code, to return the RTT level according to the map > size, has been > used for multiple times. It would be nice to introduce a helper for this. I have some changes to support larger host (and guest) page sizes which rework this code which should help clean this up. >> + if (map_level < RMM_RTT_MAX_LEVEL) { >> + /* >> + * A temporary RTT is needed during the map, precreate it, >> + * however if there is an error (e.g. missing parent tables) >> + * this will be handled below. >> + */ >> + realm_create_rtt_levels(realm, ipa, map_level, >> + RMM_RTT_MAX_LEVEL, memcache); >> + } >> + > > This block of code could be dropped. If the RTTs have been existing, > realm_create_rtt_levels() > doesn't nothing, but several RMI calls are issued. RMI calls aren't > cheap and it can cause > performance lost. As mentioned in the patch 19 this is to prevent the first call to rmi_data_create_unknown() failing when we know we're going to need to create the RTTs. So this should generally reduce the number of RMIs calls not increase. >> + for (size = 0; size < map_size; size += PAGE_SIZE) { >> + if (rmi_granule_delegate(phys)) { >> + /* >> + * It's likely we raced with another VCPU on the same >> + * fault. Assume the other VCPU has handled the fault >> + * and return to the guest. >> + */ >> + return 0; >> + } > > We probably can't bail immediately when error is returned from > rmi_granule_delegate() > because we intend to map a region whose size is 'map_size'. So a > 'continue' instead > of 'return 0' seems correct to me. The logic here is that two (or more) vCPUs have faulted on the same region. So if we get a failure here it means that we're the second vCPU to hit the fault. While we could continue, it's highly likely that the other vCPU will be faulting in the later parts of the region so we expect to hit more delegation failures. Returning to the guest and letting the guest retry the access means: a) The access might now continue because the other vCPU has completed (enough of) the mapping, or b) We fault a second time, but give the other vCPU some more time to progress in creating the mapping (and prevent the other vCPU being tripped up by continued attempts by this vCPU from delegating the pages). "continue" would work, but it's likely to waste CPU time with the two vCPUs fighting each other to complete the mapping. >> + >> + ret = rmi_data_create_unknown(rd, phys, ipa); >> + >> + if (RMI_RETURN_STATUS(ret) == RMI_ERROR_RTT) { >> + /* Create missing RTTs and retry */ >> + int level = RMI_RETURN_INDEX(ret); >> + >> + ret = realm_create_rtt_levels(realm, ipa, level, >> + RMM_RTT_MAX_LEVEL, >> + memcache); >> + WARN_ON(ret); >> + if (ret) >> + goto err_undelegate; > > if (WARN_ON(ret)) ack >> + >> + ret = rmi_data_create_unknown(rd, phys, ipa); >> + } >> + WARN_ON(ret); >> + >> + if (ret) >> + goto err_undelegate; > > if (WARN_ON(ret)) ack >> + >> + phys += PAGE_SIZE; >> + ipa += PAGE_SIZE; >> + } >> + >> + if (map_size == RMM_L2_BLOCK_SIZE) >> + ret = fold_rtt(realm, base_ipa, map_level); >> + if (WARN_ON(ret)) >> + goto err; >> + > > The nested if statements are needed here because the WARN_ON() only > take effect on the return value from fold_rtt(). > > if (map_size == RMM_L2_BLOCK_SIZE) { > ret = fold_rtt(realm, base_ipa, map_level); > if (WARN_ON(ret)) > goto err; > } Technically to get here then ret==0 so there's no need to nest like this. But I agree it makes the intent much clearer so I'll update. Thanks, Steve