On Tue, Nov 20, 2018 at 03:30:29PM +0000, Alex Bennée wrote:
>
> Dave Martin <Dave.Martin@xxxxxxx> writes:

[...]

> >> Put calculating guest_has_sve at the top of __hyp_switch_fpsimd make
> >> most of that go away and just moves things around a little bit. So I
> >> guess it could makes sense for the fast(ish) path although I'd be
> >> interested in knowing if it made any real difference to the numbers.
> >> After all the first read should be well cached and moving it through the
> >> stack is just additional memory and register pressure.
> >
> > Hmmm, I will have a think about this when I respin.
> >
> > Explicitly caching guest_has_sve() does reduce the compiler's freedom to
> > optimise.
> >
> > We might be able to mark it as __pure or __attribute_const__ to enable
> > the compiler to decide whether to cache the result, but this may not be
> > 100% safe.
> >
> > Part of me would prefer to leave things as they are to avoid the risk of
> > breaking the code again...
>
> Given that the only place you call __hyp_switch_fpsimd is here you could
> just roll in into __hyp_trap_is_fpsimd and have:
>
>   if (__hyp_trap_is_fpsimd(vcpu))
>     return true;

Possibly, though the function should be renamed in this case, something
like __hyp_handle_fpsimd_trap() I guess.

Cheers
---Dave
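
For illustration only, here is a rough, self-contained sketch of the two ideas discussed above: letting the compiler decide whether to cache guest_has_sve() by marking it pure, and folding the trap check and the switch into a single helper. Everything with a _sketch suffix, including the struct and its fields, is hypothetical and is not the kernel code. The GCC "pure" attribute allows the function to read memory through its pointer argument. The "const" attribute does not, unless that memory never changes, which is one reason __attribute_const__ may not be safe for a helper that dereferences vcpu.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct kvm_vcpu; the real structure differs. */
struct vcpu_sketch {
	bool has_sve;		/* assumed: whether the guest was given SVE */
	bool trap_is_fpsimd;	/* assumed: decoded trap was an FPSIMD access */
	bool trap_is_sve;	/* assumed: decoded trap was an SVE access */
	int fp_restores;	/* counts simulated FP/SVE state restores */
};

/*
 * "pure": no side effects, result depends only on the arguments and on
 * memory that is read. The compiler may merge repeated calls when it can
 * prove that nothing the function reads was written in between.
 */
__attribute__((pure))
static bool guest_has_sve_sketch(const struct vcpu_sketch *vcpu)
{
	return vcpu->has_sve;
}

/*
 * Combined "is this an FPSIMD/SVE trap?" check and handler.
 * Returns true if the trap was consumed here, false if the caller
 * should fall through to its other exit handling.
 */
static bool hyp_handle_fpsimd_trap_sketch(struct vcpu_sketch *vcpu)
{
	bool sve_trap = vcpu->trap_is_sve;

	if (!vcpu->trap_is_fpsimd && !sve_trap)
		return false;

	/* Assumption: an SVE trap from a non-SVE guest is handled elsewhere. */
	if (sve_trap && !guest_has_sve_sketch(vcpu))
		return false;

	vcpu->fp_restores++;	/* stands in for the actual register switch */
	return true;
}
```

With a helper shaped like this, the call site would reduce to the two-line form quoted above: test the return value and return true when the trap was handled.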