On Fri, 28 Jul 2023 19:33:31 +0100, Oliver Upton <oliver.upton@xxxxxxxxx> wrote: > > On Fri, Jul 28, 2023 at 09:29:40AM +0100, Marc Zyngier wrote: > > [...] > > > +/* > > + * Bit assignment for the trap controls. We use a 64bit word with the > > + * following layout for each trapped sysreg: > > + * > > + * [9:0] enum trap_group (10 bits) > > + * [13:10] enum fgt_group_id (4 bits) > > + * [19:14] bit number in the FGT register (6 bits) > > + * [20] trap polarity (1 bit) > > + * [62:21] Unused (42 bits) > > + * [63] RES0 - Must be zero, as lost on insertion in the xarray > > + */ > > +union trap_config { > > + u64 val; > > + struct { > > + unsigned long cgt:10; /* Coarse trap id */ > > + unsigned long fgt:4; /* Fing Grained Trap id */ > > + unsigned long bit:6; /* Bit number */ > > + unsigned long pol:1; /* Polarity */ > > + unsigned long unk:42; /* Unknown */ > > + unsigned long mbz:1; /* Must Be Zero */ > > + }; > > +}; > > Correct me if I'm wrong, but I don't think the compiler is going to > whine if any of these bitfields are initialized with a larger value than > can be represented... Do you think some BUILD_BUG_ON() is in order to > ensure that trap_group fits in ::cgt? > > BUILD_BUG_ON(__NR_TRAP_IDS__ >= BIT(10)); Indeed. This might also apply to ::fgt, and I want to add some sanity checks to verify that the whole union isn't larger than a 'void *', as we rely on that. > > > +struct encoding_to_trap_config { > > + const u32 encoding; > > + const u32 end; > > + const union trap_config tc; > > +}; > > + > > +#define SR_RANGE_TRAP(sr_start, sr_end, trap_id) \ > > + { \ > > + .encoding = sr_start, \ > > + .end = sr_end, \ > > + .tc = { \ > > + .cgt = trap_id, \ > > + }, \ > > + } > > + > > +#define SR_TRAP(sr, trap_id) SR_RANGE_TRAP(sr, sr, trap_id) > > + > > +/* > > + * Map encoding to trap bits for exception reported with EC=0x18. > > + * These must only be evaluated when running a nested hypervisor, but > > + * that the current context is not a hypervisor context. When the > > + * trapped access matches one of the trap controls, the exception is > > + * re-injected in the nested hypervisor. > > + */ > > +static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = { > > +}; > > + > > +static DEFINE_XARRAY(sr_forward_xa); > > + > > +static union trap_config get_trap_config(u32 sysreg) > > +{ > > + return (union trap_config) { > > + .val = xa_to_value(xa_load(&sr_forward_xa, sysreg)), > > + }; > > Should we be checking for NULL here? AFAICT, the use of sentinel values > in the trap_group enum would effectively guarantee each trap_config has > a nonzero value. if xa_load() returns NULL, xa_to_value() will still give us a 0, which is an indication of a sysreg that isn't present in the trap configuration. This can happen if we trap something that isn't yet supported in NV, which is quite common. This allows us to use features on the host without having to immediately write the same support for NV guests. But this is obviously a temporary situation. At some point, I'll become a total bastard and demand that people treat NV as a first class citizen. One day ;-). > > > +} > > + > > +void __init populate_nv_trap_config(void) > > +{ > > + for (int i = 0; i < ARRAY_SIZE(encoding_to_cgt); i++) { > > + const struct encoding_to_trap_config *cgt = &encoding_to_cgt[i]; > > + void *prev; > > + > > + prev = xa_store_range(&sr_forward_xa, cgt->encoding, cgt->end, > > + xa_mk_value(cgt->tc.val), GFP_KERNEL); > > + WARN_ON(prev); > > Returning the error here and failing the overall KVM initialization > seems to be the safest option. The WARN is still handy, though. Yeah, this has found a number of bugs initially. Happy to fail the initialisation after we've iterated over all the array (nothing is more annoying than fixing a bunch of errors iteratively). > > > + } > > + > > + kvm_info("nv: %ld coarse grained trap handlers\n", > > + ARRAY_SIZE(encoding_to_cgt)); > > + > > +} > > + > > +static enum trap_behaviour get_behaviour(struct kvm_vcpu *vcpu, > > + const struct trap_bits *tb) > > +{ > > + enum trap_behaviour b = BEHAVE_HANDLE_LOCALLY; > > + u64 val; > > + > > + val = __vcpu_sys_reg(vcpu, tb->index); > > + if ((val & tb->mask) == tb->value) > > + b |= tb->behaviour; > > + > > + return b; > > +} > > + > > +static enum trap_behaviour __do_compute_trap_behaviour(struct kvm_vcpu *vcpu, > > + const enum trap_group id, > > + enum trap_behaviour b) > > +{ > > + switch (id) { > > + const enum trap_group *cgids; > > + > > + case __RESERVED__ ... __MULTIPLE_CONTROL_BITS__ - 1: > > + if (likely(id != __RESERVED__)) > > + b |= get_behaviour(vcpu, &coarse_trap_bits[id]); > > + break; > > + case __MULTIPLE_CONTROL_BITS__ ... __COMPLEX_CONDITIONS__ - 1: > > + /* Yes, this is recursive. Don't do anything stupid. */ > > + cgids = coarse_control_combo[id - __MULTIPLE_CONTROL_BITS__]; > > + for (int i = 0; cgids[i] != __RESERVED__; i++) > > + b |= __do_compute_trap_behaviour(vcpu, cgids[i], b); > > Would it make sense to WARN here if one of the child trap ids was in the > recursive range? This might be needed at some point, but we can probably tighten it for now. Thanks, M. -- Without deviation from the norm, progress is not possible.