Re: [PATCH v2] kasan: fix deadlock in start_report()

Catalin Marinas <catalin.marinas@xxxxxxx> · Tue, 28 Feb 2023 16:09:08 +0000

On Mon, Feb 27, 2023 at 03:13:45AM +0100, Andrey Konovalov wrote:
> +Catalin, would it be acceptable to implement a routine that disables
> in-kernel MTE tag checking (until the next
> mte_enable_kernel_sync/async/asymm call)? In a similar way an MTE
> fault does this, but without the fault itself. I.e., expose the part
> of do_tag_recovery functionality without report_tag_fault?

I don't think we ever re-enable MTE after do_tag_recovery(). The
mte_enable_kernel_*() functions are called at boot. We do call
kasan_enable_tagging() explicitly in the kunit tests but that's a
controlled fault environment.

IIUC, the problem is that the kernel has already taken an MTE fault, so
at that point the error is not really recoverable. If we wanted to avoid
a fault in the first place, we could do something like
__uaccess_enable_tco() (Vincenzo has some patches to generalise these
routines), but if an MTE fault has already triggered and MTE is to stay
disabled after the reporting anyway, I don't think it's worth it.

So I wonder whether it's easier to just disable MTE before calling
report_tag_fault() so that it won't trigger additional faults:

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index f4cb0f85ccf4..1449d2bc6f10 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -329,8 +329,6 @@ static void do_tag_recovery(unsigned long addr, unsigned long esr,
 			   struct pt_regs *regs)
 {
 
-	report_tag_fault(addr, esr, regs);
-
 	/*
 	 * Disable MTE Tag Checking on the local CPU for the current EL.
 	 * It will be done lazily on the other CPUs when they will hit a
@@ -339,6 +337,8 @@ static void do_tag_recovery(unsigned long addr, unsigned long esr,
 	sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF_MASK,
 			 SYS_FIELD_PREP_ENUM(SCTLR_EL1, TCF, NONE));
 	isb();
+
+	report_tag_fault(addr, esr, regs);
 }
 
 static bool is_el1_mte_sync_tag_check_fault(unsigned long esr)
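
The ordering argument behind the diff above can be illustrated with a
small userspace model. This is only a sketch, not kernel code: every
model_* name is hypothetical, the boolean stands in for the SCTLR_EL1.TCF
setting, and the model assumes that reporting touches the mismatched
memory once more.

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <stdbool.h>

/* Userspace model only: none of these names are kernel APIs. */
bool model_tag_checking_enabled;	/* stands in for SCTLR_EL1.TCF != NONE */
int model_nested_faults;		/* faults raised while reporting */

/*
 * Assumption: reporting accesses the faulting object again, so it hits
 * the same tag mismatch if checking is still enabled.
 */
void model_report_tag_fault(void)
{
	if (model_tag_checking_enabled)
		model_nested_faults++;
}

/* Models clearing the tag check fault mode on the local CPU. */
void model_disable_tag_checking(void)
{
	model_tag_checking_enabled = false;
}

/* Original order: report first, then disable checking. */
int model_recovery_report_first(void)
{
	model_tag_checking_enabled = true;
	model_nested_faults = 0;
	model_report_tag_fault();
	model_disable_tag_checking();
	return model_nested_faults;
}

/* Proposed order: disable checking first, then report. */
int model_recovery_disable_first(void)
{
	model_tag_checking_enabled = true;
	model_nested_faults = 0;
	model_disable_tag_checking();
	model_report_tag_fault();
	return model_nested_faults;
}
```

In this model, model_recovery_report_first() returns 1 (one nested
fault during reporting) and model_recovery_disable_first() returns 0,
which is the behaviour the reordering is meant to achieve.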

-- 
Catalin