On Thu, Sep 10, 2020 at 12:12:27PM +0100, Steven Price wrote:
> On 10/09/2020 11:52, Catalin Marinas wrote:
> > On Thu, Sep 10, 2020 at 11:23:33AM +0100, Steven Price wrote:
> > > On 04/09/2020 11:30, Catalin Marinas wrote:
> > > > --- /dev/null
> > > > +++ b/arch/arm64/lib/mte.S
> > > > @@ -0,0 +1,34 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0-only */
> > > > +/*
> > > > + * Copyright (C) 2020 ARM Ltd.
> > > > + */
> > > > +#include <linux/linkage.h>
> > > > +
> > > > +#include <asm/assembler.h>
> > > > +#include <asm/sysreg.h>
> > > > +
> > > > +	.arch	armv8.5-a+memtag
> > > > +
> > > > +/*
> > > > + * multitag_transfer_size - set \reg to the block size that is accessed by the
> > > > + * LDGM/STGM instructions.
> > > > + */
> > > > +	.macro	multitag_transfer_size, reg, tmp
> > > > +	mrs_s	\reg, SYS_GMID_EL1
> > > > +	ubfx	\reg, \reg, #SYS_GMID_EL1_BS_SHIFT, #SYS_GMID_EL1_BS_SIZE
> > > > +	mov	\tmp, #4
> > > > +	lsl	\reg, \tmp, \reg
> > > > +	.endm
> > > > +
> > > > +/*
> > > > + * Clear the tags in a page
> > > > + *   x0 - address of the page to be cleared
> > > > + */
> > > > +SYM_FUNC_START(mte_clear_page_tags)
> > > > +	multitag_transfer_size x1, x2
> > > > +1:	stgm	xzr, [x0]
> > > > +	add	x0, x0, x1
> > > > +	tst	x0, #(PAGE_SIZE - 1)
> > > > +	b.ne	1b
> > > > +	ret
> > > > +SYM_FUNC_END(mte_clear_page_tags)
> > >
> > > Could the value of SYS_GMID_EL1 vary between CPUs and do we therefore need a
> > > preempt_disable() around mte_clear_page_tags() (and other functions in later
> > > patches)?
> >
> > If they differ, disabling preemption here is not sufficient. We'd have
> > to trap the GMID_EL1 access at EL2 as well and emulate it (we do this
> > for CTR_EL0 in dcache_line_size).
>
> Hmm, good point. It's actually not possible to properly emulate this - EL2
> can trap GMID_EL1 to provide a different (presumably smaller) size, but
> LDGM/STGM will still read/store the number of tags of the underlying
> hardware. While simple loops like we've got at the moment won't care (we'll
> just end up doing useless work), it won't be architecturally correct. The
> guest can always deduce the underlying value. So I think we can safely
> consider this broken hardware.

I think that's similar to the DC ZVA (and DCZID_EL0.BS) case where
faking it could lead to data corruption if the software assumes it
writes a maximum number of bytes.

(I meant to raise a ticket with the architects to make this a
requirement in the ARM ARM but forgot about it)

> > I don't want to proactively implement this just in case we'll have
> > broken hardware (I feel a bit more optimistic today ;)).
>
> Given the above I think if we do have broken hardware the only sane thing to
> do would be to provide a way of overriding multitag_transfer_size to return
> the smallest size of all CPUs. Which works well enough for the uses we've
> currently got.

If we do have such broken hardware, we should probably drop the STGM
instructions in favour of STG or ST2G. Luckily, STGM/LDGM are not
available in user space.

-- 
Catalin
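
For illustration only, and not part of the patch: a minimal C sketch of the
arithmetic done by multitag_transfer_size (block size in bytes = 4 << BS) and
of the "smallest size of all CPUs" override suggested above. The GMID_BS_*
constants and min_multitag_transfer_size() are hypothetical names; the field
position (BS in the low four bits of GMID_EL1) is an assumption standing in
for SYS_GMID_EL1_BS_SHIFT and SYS_GMID_EL1_BS_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of GMID_EL1.BS; stands in for SYS_GMID_EL1_BS_SHIFT/SIZE. */
#define GMID_BS_SHIFT	0u
#define GMID_BS_SIZE	4u

/* Bytes covered by one LDGM/STGM, as computed by the macro: 4 << BS. */
static uint64_t multitag_transfer_size(uint64_t gmid)
{
	uint64_t bs = (gmid >> GMID_BS_SHIFT) & ((1ull << GMID_BS_SIZE) - 1);

	return 4ull << bs;
}

/*
 * Hypothetical override: the smallest block size over the GMID_EL1 values
 * sampled from every CPU (how they are sampled is left out here).
 */
static uint64_t min_multitag_transfer_size(const uint64_t *gmid, size_t ncpus)
{
	uint64_t min;
	size_t i;

	assert(ncpus > 0);

	min = multitag_transfer_size(gmid[0]);
	for (i = 1; i < ncpus; i++) {
		uint64_t sz = multitag_transfer_size(gmid[i]);

		if (sz < min)
			min = sz;
	}
	return min;
}
```

Stepping a clearing loop by the minimum only makes larger-block CPUs repeat
work, which matches the "useless work" remark quoted above. It does not make
STGM itself touch fewer tags, so the STG/ST2G fallback remains the safer
option.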