Hi Marc, On 2/13/24 15:53, Marc Zyngier wrote: > Hey Eric, > > On Tue, 13 Feb 2024 13:59:31 +0000, > Eric Auger <eauger@xxxxxxxxxx> wrote: >> >> Hi, >> >> On 9/19/23 19:50, Jing Zhang wrote: >>> Add some basic documentation on how to get feature ID register writable >>> masks from userspace. >>> >>> Signed-off-by: Jing Zhang <jingzhangos@xxxxxxxxxx> >>> --- >>> Documentation/virt/kvm/api.rst | 42 ++++++++++++++++++++++++++++++++++ >>> 1 file changed, 42 insertions(+) >>> >>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst >>> index 21a7578142a1..2defb5e198ce 100644 >>> --- a/Documentation/virt/kvm/api.rst >>> +++ b/Documentation/virt/kvm/api.rst >>> @@ -6070,6 +6070,48 @@ writes to the CNTVCT_EL0 and CNTPCT_EL0 registers using the SET_ONE_REG >>> interface. No error will be returned, but the resulting offset will not be >>> applied. >>> >>> +4.139 KVM_ARM_GET_REG_WRITABLE_MASKS >>> +------------------------------------------- >>> + >>> +:Capability: KVM_CAP_ARM_SUPPORTED_FEATURE_ID_RANGES >>> +:Architectures: arm64 >>> +:Type: vm ioctl >>> +:Parameters: struct reg_mask_range (in/out) >>> +:Returns: 0 on success, < 0 on error >>> + >>> + >>> +:: >>> + >>> + #define ARM64_FEATURE_ID_SPACE_SIZE (3 * 8 * 8) >>> + #define ARM64_FEATURE_ID_RANGE_IDREGS BIT(0) >>> + >>> + struct reg_mask_range { >>> + __u64 addr; /* Pointer to mask array */ >>> + __u32 range; /* Requested range */ >>> + __u32 reserved[13]; >>> + }; >>> + >>> +This ioctl copies the writable masks for Feature ID registers to userspace. >>> +The Feature ID space is defined as the AArch64 System register space with >>> +op0==3, op1=={0, 1, 3}, CRn==0, CRm=={0-7}, op2=={0-7}. >> when attempting a migration between Ampere Altra and ThunderXv2 the >> first hurdle is to handle a failure when writing ICC_CTLR_EL1 >> (3.0.12.12.4) on dest. This reg is outside of the scope of the above >> single range (BIT(0)). > > Indeed. But more importantly, this isn't really an ID register. Plenty > of variable bits in there. > >> This may be questionable if we want to migrate between those types of >> machines but the goal is to exercise different scenarios to have a >> gloval view of the problems. > > I think this is a valuable experiment, and we should definitely > explore this sort of things (as I cannot see the diversity of ARM > system slowing down any time soon). > >> >> This reg exposes some RO capabilities such as ExtRange, A3V, SEIS,hyp/nvhe/hyp-main.c >> IDBits, ... >> So to get the migration going further I would need to tweek this on the >> source - for instance I guess SEIS could be reset despite the host HW >> cap - without making too much trouble. > > I'm not sure SEIS is such an easy one: if you promised the guest that > it would never get an SError doing the most stupid things (SEIS=0), it > really shouldn't get one after migration. If you advertised it on the > source HW (Altra), a migration to TX2 would be fine. I see. Indeed for sure we must make sure KVM cannot inject SEIs in the guest if SEIS is not advertised on guest side. In this case SEIS is 0 on src Altra. On dst TX2 ich_vtr_el2 advertises it and host seis =1 causing set_gic_ctlr to fail (vgic-sys-reg-v3.c). > > The other bits are possible to change depending on the requirements of > the VM (aff3, IDBits), and ExtRange should always be set to 0 (because > our GIC implementation doesn't support EPPI/ESPI). > > The really ugly part here is that if you want to affect these bits, > you will need to trap and emulate the access. Not a big deal, but in > the absence of FGT, you will need to handle the full Common trap > group, which is going to slow things down (you will have to trap > ICV_PMR_EL1, for example). Would it be sensible to simplify things and only support the new range if FGT is supported? > >> What would you recommend, adding a new range? But I guess we need to >> design ranges carefully otherwise we may be quickly limited by the >> number of flag bits. > > I can see a need to adding a range that would cover non-ID registers > that have RO fields. But we also need to consider the case of EL2 > registers that take part in this. Yes I need to better understand the consistency with ICH_VTR_EL2 > > For example, ICV_CTLR_EL1 and ICH_VTR_EL2 and deeply linked, and share > some fields. Without NV, no need to expose HCR_VTR_EL2. With NV, this > register actually drives ICV_CTLR_EL1. understood. > > So careful planning is required here, and the impact of this measured. Understood. More time needed to study the code ;-) Thanks! Eric > > Thanks, > > M. >