On 10/09/2020 11:52, Catalin Marinas wrote:
On Thu, Sep 10, 2020 at 11:23:33AM +0100, Steven Price wrote:
On 04/09/2020 11:30, Catalin Marinas wrote:
--- /dev/null
+++ b/arch/arm64/lib/mte.S
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 ARM Ltd.
+ */
+#include <linux/linkage.h>
+
+#include <asm/assembler.h>
+#include <asm/sysreg.h>
+
+ .arch armv8.5-a+memtag
+
+/*
+ * multitag_transfer_size - set \reg to the block size that is accessed by the
+ * LDGM/STGM instructions.
+ */
+ .macro multitag_transfer_size, reg, tmp
+ mrs_s \reg, SYS_GMID_EL1
+ ubfx \reg, \reg, #SYS_GMID_EL1_BS_SHIFT, #SYS_GMID_EL1_BS_SIZE
+ mov \tmp, #4
+ lsl \reg, \tmp, \reg
+ .endm
+
+/*
+ * Clear the tags in a page
+ * x0 - address of the page to be cleared
+ */
+SYM_FUNC_START(mte_clear_page_tags)
+ multitag_transfer_size x1, x2
+1: stgm xzr, [x0]
+ add x0, x0, x1
+ tst x0, #(PAGE_SIZE - 1)
+ b.ne 1b
+ ret
+SYM_FUNC_END(mte_clear_page_tags)
Could the value of SYS_GMID_EL1 vary between CPUs and do we therefore need a
preempt_disable() around mte_clear_page_tags() (and other functions in later
patches)?
If they differ, disabling preemption here is not sufficient. We'd have
to trap the GMID_EL1 access at EL2 as well and emulate it (we do this
for CTR_EL0 in dcache_line_size).
Hmm, good point. It's actually not possible to properly emulate this -
EL2 can trap GMID_EL1 to provide a different (presumably smaller) size,
but LDGM/STGM will still read/store the number of tags of the underlying
hardware. While simple loops like we've got at the moment won't care
(we'll just end up doing useless work), it won't be architecturally
correct. The guest can always deduce the underlying value. So I think we
can safely consider this broken hardware.
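To illustrate the point about emulation, here is a minimal userspace C model (not kernel code). It assumes 16-byte tag granules, a 4096-byte page, and that STGM always acts on the naturally aligned, hardware-sized block containing the address. The names model_stgm_zero, clear_page_tags and page_is_clear are hypothetical. The loop has the same shape as mte_clear_page_tags, but steps by an advertised size that may be smaller than the real one.

```c
#include <assert.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define GRANULE   16u                     /* assumed bytes covered by one tag */
#define NTAGS     (PAGE_SIZE / GRANULE)

static unsigned char tags[NTAGS];         /* one entry per tag in the page */
static unsigned long tag_writes;          /* individual tag stores performed */

/* Model of "stgm xzr, [addr]": zero every tag in the hardware-sized block. */
static void model_stgm_zero(unsigned addr, unsigned hw_block)
{
	/* Assumption: the block is naturally aligned to the hardware size. */
	unsigned base = addr & ~(hw_block - 1);

	for (unsigned off = 0; off < hw_block; off += GRANULE) {
		tags[(base + off) / GRANULE] = 0;
		tag_writes++;
	}
}

/*
 * Same loop shape as mte_clear_page_tags, but the step is the size the
 * (possibly emulated) GMID_EL1 read reported. Returns the STGM count.
 */
static unsigned clear_page_tags(unsigned advertised, unsigned hw_block)
{
	unsigned addr = 0, stores = 0;

	/* The model only covers an advertised size no larger than the real one. */
	assert(advertised <= hw_block);

	tag_writes = 0;
	memset(tags, 0xf, sizeof(tags));  /* start with non-zero tags */
	do {
		model_stgm_zero(addr, hw_block);
		addr += advertised;
		stores++;
	} while (addr & (PAGE_SIZE - 1));

	return stores;
}

/* Returns 1 if every tag in the modelled page is zero. */
static int page_is_clear(void)
{
	for (unsigned i = 0; i < NTAGS; i++)
		if (tags[i])
			return 0;
	return 1;
}
```

In this model, calling clear_page_tags(64, 256) still leaves the page fully cleared, but it issues four times as many stores as clear_page_tags(256, 256), and each of those stores still touches a full 256-byte block. That is the "useless work" case: the result is right for a simple loop, yet the advertised size does not describe what the instruction does.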
I don't want to proactively implement this just in case we'll have
broken hardware (I feel a bit more optimistic today ;)).
Given the above, I think that if we do have broken hardware, the only sane
thing to do would be to provide a way of overriding
multitag_transfer_size to return the smallest size of all CPUs, which
works well enough for the uses we've currently got.
Steve
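
The override suggested above could be sketched roughly as follows, in plain C rather than kernel code. This is only an illustration under stated assumptions: bs_to_bytes and min_multitag_transfer_size are hypothetical names, and the per-CPU GMID_EL1.BS values are assumed to have been collected into an array by some other means.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes accessed by one LDGM/STGM for a given GMID_EL1.BS value.
 * This is 4 << BS, the same computation as the mov/lsl pair in the
 * multitag_transfer_size macro.
 */
static unsigned bs_to_bytes(unsigned bs)
{
	return 4u << bs;
}

/*
 * Hypothetical helper: given the BS field read on each CPU, return the
 * smallest block size in bytes. Stepping by this value means a loop never
 * skips memory on any CPU, at the cost of redundant work on CPUs whose
 * real block size is larger.
 */
static unsigned min_multitag_transfer_size(const unsigned *bs_per_cpu,
					   size_t ncpus)
{
	unsigned min_bs;

	assert(ncpus > 0);
	min_bs = bs_per_cpu[0];
	for (size_t i = 1; i < ncpus; i++)
		if (bs_per_cpu[i] < min_bs)
			min_bs = bs_per_cpu[i];

	return bs_to_bytes(min_bs);
}
```

With BS values of 6, 4 and 5 on three CPUs, the helper would return 64 bytes, so every CPU would step through a page in 64-byte increments.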