On Tue, Oct 18, 2022 at 05:52:18PM +0100, Mark Rutland wrote:
> On Tue, Oct 18, 2022 at 03:05:14PM +0100, Will Deacon wrote:
> > On Tue, Oct 18, 2022 at 12:06:14PM +0100, Mark Rutland wrote:
> > > If the tables are shared, you need broadcast maintenance and ISH barriers here,
> > > or you risk the usual issues with asynchronous MMU behaviour.
> >
> > Can you elaborate a bit, please? What we're trying to do is reserve a page
> > of VA space for each CPU, which is only ever accessed explicitly by that
> > CPU using a normal memory mapping. The fixmap code therefore just updates
> > the relevant leaf entry for the CPU on which we're running and the TLBI
> > is there to ensure that the new mapping takes effect.
> >
> > If another CPU speculatively walks another CPU's fixmap slot, then I agree
> > that it could access that page after the slot had been cleared. Although
> > I can see theoretical security arguments around avoiding that situation,
> > there's a very real performance cost to broadcast invalidation that we
> > were hoping to avoid on this fast path.
>
> The issue is that any CPU could walk any of these entries at any time
> for any reason, and without broadcast maintenance we'd be violating the
> Break-Before-Make requirements. That permits a number of things,
> including "amalgamation", which would permit the CPU to consume some
> arbitrary function of the old+new entries. Among other things, that can
> permit accesses to entirely bogus physical addresses that weren't in
> either entry (e.g. making speculative accesses to arbitrary device
> addresses).
>
> For correctness, you need the maintenance to be broadcast to all PEs
> which could observe the old and new entries.

Urgh, I had definitely purged that one. Thanks for the refresher.

> > Of course, in the likely event that I've purged "the usual issues" from
> > my head and we need broadcasting for _correctness_, then we'll just have
> > to suck it up!
>
> As above, I believe you need this for correctness.
>
> I'm not sure if FEAT_BBM level 2 gives you the necessary properties to
> relax this on some HW.

I'll seek clarification on this, as that amalgamation text feels a bit
over-reaching to me (particularly when combined with the fact that the
CPU still has to respect things like the NS bit) and I suspect that we
wouldn't see it in practice for this case.

But for now, I'll add the broadcasting so we don't block the series.

Will
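
For reference, the Break-Before-Make ordering with broadcast maintenance
discussed above can be sketched roughly as below. This is only a user-space
model under stated assumptions: the barrier and TLBI helpers are hypothetical
stubs that log each step rather than the kernel's real macros, the function
name bbm_replace_entry() is made up for illustration, and the exact TLBI
operation the series ends up using is not stated in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical log of the steps taken, in order (illustration only). */
#define MAX_STEPS 16
static const char *steps[MAX_STEPS];
static int nsteps;

static void record(const char *what)
{
	assert(nsteps < MAX_STEPS);
	steps[nsteps++] = what;
}

/* Stubs standing in for real barrier/TLBI instructions (assumed names). */
static void dsb_ishst(void) { record("dsb ishst"); }
static void dsb_ish(void)   { record("dsb ish"); }
static void isb(void)       { record("isb"); }

static void tlbi_va_is(uint64_t va)
{
	(void)va;	/* a real TLBI would encode the VA in its operand */
	record("tlbi is");
}

/*
 * Replace one leaf entry in a table that other CPUs may walk.
 * Break: write an invalid entry, then broadcast the invalidation and
 * wait for it to complete. Make: only then write the new entry.
 */
void bbm_replace_entry(volatile uint64_t *ptep, uint64_t va, uint64_t new_pte)
{
	*ptep = 0;			/* break: entry is now invalid */
	record("store invalid");
	dsb_ishst();			/* make the store visible to walkers */
	tlbi_va_is(va);			/* broadcast to the inner-shareable domain */
	dsb_ish();			/* wait for the invalidation to complete */
	isb();

	*ptep = new_pte;		/* make: install the new mapping */
	record("store new");
	dsb_ishst();
	isb();
}
```

In this model a single call logs eight steps, with the broadcast invalidation
and its completion barrier sitting between the two stores, so no walker can
observe the old and new entries at the same time.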