On Thu, Mar 15, 2018 at 06:47:22PM +0000, Marc Zyngier wrote: > On 15/03/18 17:08, Andrew Jones wrote: > > On Thu, Mar 15, 2018 at 04:17:53PM +0000, Marc Zyngier wrote: > >> On 15/03/18 15:54, Andrew Jones wrote: > >>> On Wed, Mar 14, 2018 at 04:50:48PM +0000, Marc Zyngier wrote: > >>>> static inline void *kvm_get_hyp_vector(void) > >>>> { > >>>> struct bp_hardening_data *data = arm64_get_bp_hardening_data(); > >>>> - void *vect = kvm_ksym_ref(__kvm_hyp_vector); > >>>> + void *vect = kern_hyp_va(kvm_ksym_ref(__kvm_hyp_vector)); > >>>> + int slot = -1; > >>>> > >>>> - if (data->fn) { > >>>> - vect = __bp_harden_hyp_vecs_start + > >>>> - data->hyp_vectors_slot * SZ_2K; > >>>> + if (cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR) && data->fn) { > >>> > >>> Now that I'm trying to consider heterogeneous systems, I'm wondering why > >>> harden BP isn't checked with this_cpu_has_cap() like harden EL2 vectors. > >>> I'm also wondering if it's even possible for CPUs to come up without all > >>> of them having harden EL2 vectors if the boot CPU has it. Won't > >>> verify_local_cpu_errata_workarounds() require it? I'm probably just > >>> getting lost in all the capability stuff... > >> > >> Checking harden BP on this particular CPU will only tell you if this CPU > >> is affected, but won't give you any additional information (i.e. you'd > >> still need to check obtain the stuff pointed to by data). > >> > >> Checking for *all* CPUs in one go has some advantages: it is a static > >> key, which means that unaffected platforms will fly (this is a hot path > >> on VHE), and you can check data if you're affected. > > > > Is there a problem with using the static key to check > > ARM64_HARDEN_EL2_VECTORS below? (Sorry, I'm still confused why we're not > > using the same functions for both.) > > Consider the following case: heterogeneous system with CPU0 that has > HBP, and CPU1 that has HEL. CPU0 is using its own vector slot, and CPU1 > should be using an extra one (it doesn't have a dedicated slot). > > When CPU0 claims its vectors, it will get the "normal" version of the > mapping. And then, if we're using cpus_have_cap, we turn that into an > out-of-line mapping. That's not completely wrong (the system still runs > fine), but that's not the expected behaviour. If we restrict the second > phase to this_cpu_has_cap(), we get the expected CPU0 using the normal > mapping, and CPU1 does the exact opposite (which we also want). Thanks for the additional explanation. I've finally got my head wrapped around this and it looks good to me. And, I see now that my comment about verify_local_cpu_errata_workarounds() only applies to CPUs plugged after boot. The comment in check_local_cpu_capabilities() states clearly each CPU present at boot gets a chance to update cpu_hwcaps - which also clarifies why kvm_map_vectors() *must* use cpus_have_const_cap(), else risk the el2 vector slot not being allocated. Thanks, drew