On Tuesday 20 Jul 2021 at 11:13:31 (+0100), Marc Zyngier wrote:
> On Mon, 19 Jul 2021 16:49:13 +0100,
> Quentin Perret <qperret@xxxxxxxxxx> wrote:
> >
> > On Monday 19 Jul 2021 at 15:43:34 (+0100), Marc Zyngier wrote:
> > > On Mon, 19 Jul 2021 11:47:29 +0100,
> > > Quentin Perret <qperret@xxxxxxxxxx> wrote:
> > > >
> > > > The hypervisor will soon be in charge of tracking ownership of all
> > > > memory pages in the system. The current page-tracking infrastructure at
> > > > EL2 only allows binary states: a page is either owned or not by an
> > > > entity. But a number of use-cases will require more complex states for
> > > > pages that are shared between two entities (host, hypervisor, or guests).
> > > >
> > > > In preparation for supporting these use-cases, introduce in the KVM
> > > > page-table library some infrastructure allowing to tag shared pages
> > > > using ignored bits (a.k.a. software bits) in PTEs.
> > > >
> > > > Signed-off-by: Quentin Perret <qperret@xxxxxxxxxx>
> > > > ---
> > > >  arch/arm64/include/asm/kvm_pgtable.h |  5 +++++
> > > >  arch/arm64/kvm/hyp/pgtable.c         | 25 +++++++++++++++++++++++++
> > > >  2 files changed, 30 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > > > index dd72653314c7..f6d3d5c8910d 100644
> > > > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > > > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > > > @@ -81,6 +81,8 @@ enum kvm_pgtable_stage2_flags {
> > > >   * @KVM_PGTABLE_PROT_W:          Write permission.
> > > >   * @KVM_PGTABLE_PROT_R:          Read permission.
> > > >   * @KVM_PGTABLE_PROT_DEVICE:     Device attributes.
> > > > + * @KVM_PGTABLE_STATE_SHARED:    Page shared with another entity.
> > > > + * @KVM_PGTABLE_STATE_BORROWED:  Page borrowed from another entity.
> > > >   */
> > > >  enum kvm_pgtable_prot {
> > > >      KVM_PGTABLE_PROT_X          = BIT(0),
> > > > @@ -88,6 +90,9 @@ enum kvm_pgtable_prot {
> > > >      KVM_PGTABLE_PROT_R          = BIT(2),
> > > >
> > > >      KVM_PGTABLE_PROT_DEVICE     = BIT(3),
> > > > +
> > > > +    KVM_PGTABLE_STATE_SHARED    = BIT(4),
> > > > +    KVM_PGTABLE_STATE_BORROWED  = BIT(5),
> > >
> > > I'd rather have some indirection here, as we have other potential
> > > users for the SW bits outside of pKVM (see the NV series, which uses
> > > some of these SW bits as the backend for TTL-based TLB invalidation).
> > >
> > > Can we instead only describe the SW bit states in this enum, and let
> > > the users map the semantic they require onto that state? See [1] for
> > > what I carry in the NV branch.
> >
> > Works for me -- I just wanted to make sure we don't have users in
> > different places that use the same bits without knowing, but no strong
> > opinions, so happy to change.
> >
> > > >  };
> > > >
> > > >  #define KVM_PGTABLE_PROT_RW (KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W)
> > > > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > > > index 5bdbe7a31551..51598b79dafc 100644
> > > > --- a/arch/arm64/kvm/hyp/pgtable.c
> > > > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > > > @@ -211,6 +211,29 @@ static kvm_pte_t kvm_init_invalid_leaf_owner(u8 owner_id)
> > > >      return FIELD_PREP(KVM_INVALID_PTE_OWNER_MASK, owner_id);
> > > >  }
> > > >
> > > > +static kvm_pte_t pte_ignored_bit_prot(enum kvm_pgtable_prot prot)
> > >
> > > Can we call these sw rather than ignored?
> >
> > Sure.
> >
> > > > +{
> > > > +    kvm_pte_t ignored_bits = 0;
> > > > +
> > > > +    /*
> > > > +     * Ignored bits 0 and 1 are reserved to track the memory ownership
> > > > +     * state of each page:
> > > > +     *   00: The page is owned solely by the page-table owner.
> > > > +     *   01: The page is owned by the page-table owner, but is shared
> > > > +     *       with another entity.
> > > > +     *   10: The page is shared with, but not owned by the page-table owner.
> > > > +     *   11: Reserved for future use (lending).
> > > > +     */
> > > > +    if (prot & KVM_PGTABLE_STATE_SHARED) {
> > > > +        if (prot & KVM_PGTABLE_STATE_BORROWED)
> > > > +            ignored_bits |= BIT(1);
> > > > +        else
> > > > +            ignored_bits |= BIT(0);
> > > > +    }
> > > > +
> > > > +    return FIELD_PREP(KVM_PTE_LEAF_ATTR_IGNORED, ignored_bits);
> > > > +}
> > > > +
> > > >  static int kvm_pgtable_visitor_cb(struct kvm_pgtable_walk_data *data, u64 addr,
> > > >                                    u32 level, kvm_pte_t *ptep,
> > > >                                    enum kvm_pgtable_walk_flags flag)
> > > > @@ -357,6 +380,7 @@ static int hyp_set_prot_attr(enum kvm_pgtable_prot prot, kvm_pte_t *ptep)
> > > >      attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_AP, ap);
> > > >      attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S1_SH, sh);
> > > >      attr |= KVM_PTE_LEAF_ATTR_LO_S1_AF;
> > > > +    attr |= pte_ignored_bit_prot(prot);
> > > >      *ptep = attr;
> > > >
> > > >      return 0;
> > > > @@ -558,6 +582,7 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
> > > >
> > > >      attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
> > > >      attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
> > > > +    attr |= pte_ignored_bit_prot(prot);
> > > >      *ptep = attr;
> > > >
> > > >      return 0;
> > >
> > > How about kvm_pgtable_stage2_relax_perms()?
> >
> > It should leave SW bits untouched, and it really felt like a path where
> > we want to change permissions and nothing else. What did you have in
> > mind?
>
> It isn't clear to me that it would not (cannot?) be used to change
> other bits, given that it takes an arbitrary 'prot' set.

Sure, though it already ignores KVM_PGTABLE_PROT_DEVICE.

I guess the thing I find hard to reason about is that
kvm_pgtable_stage2_relax_perms() is 'additive'. E.g. it can make a
mapping RW if it was RO, but not the other way around. With the current
patch-set it wasn't really clear how that should translate to
KVM_PGTABLE_STATE_SHARED and such.

> If there is
> such an intended restriction, we definitely should document it.

Ack, that's definitely missing. And in fact I should probably make
kvm_pgtable_stage2_relax_perms() return -EINVAL if we're passing prot
values it can't handle.

Cheers,
Quentin
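
For illustration only, the -EINVAL idea could look roughly like the
standalone sketch below. It is not the actual kernel change:
relax_perms_check_prot(), RELAX_PERMS_UNSUPPORTED and
relax_perms_selftest() are hypothetical names, the enum simply mirrors
the flags from the quoted patch, and the sketch assumes that only the two
state flags get rejected while KVM_PGTABLE_PROT_DEVICE keeps being
ignored as it is today.

```c
#include <assert.h>
#include <errno.h>

#define BIT(n)	(1U << (n))

/* Flags as they appear in the patch quoted in this thread. */
enum kvm_pgtable_prot {
	KVM_PGTABLE_PROT_X		= BIT(0),
	KVM_PGTABLE_PROT_W		= BIT(1),
	KVM_PGTABLE_PROT_R		= BIT(2),
	KVM_PGTABLE_PROT_DEVICE		= BIT(3),
	KVM_PGTABLE_STATE_SHARED	= BIT(4),
	KVM_PGTABLE_STATE_BORROWED	= BIT(5),
};

/*
 * Assumption: relaxing permissions has no defined meaning for the
 * ownership-state flags, so those are the values "it can't handle".
 */
#define RELAX_PERMS_UNSUPPORTED \
	(KVM_PGTABLE_STATE_SHARED | KVM_PGTABLE_STATE_BORROWED)

/* Hypothetical helper: validate 'prot' before touching any PTE. */
int relax_perms_check_prot(enum kvm_pgtable_prot prot)
{
	if ((unsigned int)prot & RELAX_PERMS_UNSUPPORTED)
		return -EINVAL;

	return 0;
}

/* Hypothetical self-check showing the intended behaviour. */
void relax_perms_selftest(void)
{
	/* Plain permission bits are accepted. */
	assert(relax_perms_check_prot(KVM_PGTABLE_PROT_R | KVM_PGTABLE_PROT_W) == 0);
	/* DEVICE is not rejected here; it is assumed to stay ignored. */
	assert(relax_perms_check_prot(KVM_PGTABLE_PROT_DEVICE) == 0);
	/* Any state flag makes the request invalid. */
	assert(relax_perms_check_prot(KVM_PGTABLE_STATE_SHARED) == -EINVAL);
	assert(relax_perms_check_prot(KVM_PGTABLE_PROT_R | KVM_PGTABLE_STATE_BORROWED) == -EINVAL);
}
```

Rejecting the state flags up front would keep the 'additive' semantics
limited to R/W/X, which is the restriction that needs documenting.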