Re: [PATCH v3 05/12] arm64: csum: Disable KASAN for do_csum()

Will Deacon <will@xxxxxxxxxx> · Wed, 15 Apr 2020 20:43:06 +0100

On Wed, Apr 15, 2020 at 08:42:16PM +0200, Arnd Bergmann wrote:
> On Wed, Apr 15, 2020 at 7:28 PM Mark Rutland <mark.rutland@xxxxxxx> wrote:
> > On Wed, Apr 15, 2020 at 05:52:11PM +0100, Will Deacon wrote:
> > > do_csum() over-reads the source buffer and therefore abuses
> > > READ_ONCE_NOCHECK() to avoid tripping up KASAN. In preparation for
> > > READ_ONCE_NOCHECK() becoming a macro, and therefore losing its
> > > '__no_sanitize_address' annotation, just annotate do_csum() explicitly
> > > and fall back to normal loads.
> >
> > I'm confused by this. The whole point of READ_ONCE_NOCHECK() is that it
> > isn't checked by KASAN, so if that semantic is removed it has no reason
> > to exist.
> >
> > Changing that will break the unwind/stacktrace code across multiple
> > architectures. IIRC they use READ_ONCE_NOCHECK() for two reasons:
> >
> > 1. Races with concurrent modification, as might happen when a thread's
> >    stack is corrupted. Allowing the unwinder to bail out after a sanity
> >    check means the resulting report is more useful than a KASAN splat in
> >    the unwinder. I made the arm64 unwinder robust to this case.
> >
> > 2. I believe that the frame record itself /might/ be poisoned by KASAN,
> >    since it's not meant to be an accessible object at the C language
> >    level. I could be wrong about this, and would have to check.
> 
> I thought the main reason was deadlocks when a READ_ONCE()
> is called inside of code that is part of the KASAN handling. If
> READ_ONCE() ends up recursively calling itself, the kernel
> tends to crash once it overflows its stack.

That was also my understanding.

> > I would like to keep the unwinding robust in the first case, even if the
> > second case doesn't apply, and I'd prefer to not mark the entirety of
> > the unwinding code as unchecked as that's sufficiently large and subtle
> > that it could have nasty bugs.
> >
> > Is there any way we keep something like READ_ONCE_NOCHECK() around even
> > if we have to give it reduced functionality relative to READ_ONCE()?
> >
> > I'm not entirely sure why READ_ONCE_NOCHECK() had to go, so if there's a
> > particular pain point I'm happy to take a look.
> 
> As I understood, only this particular instance was removed, not all of
> them.

Right, but the problem is that whether the NOCHECK version gets checked
or not now depends on the caller, since it's all just a macro. If we want
to fix this, then we could force the nocheck variant to return unsigned
long, which simplifies things a lot (completely untested):

#define READ_ONCE(x)							\
({									\
	compiletime_assert_rwonce_type(x);				\
	__READ_ONCE_SCALAR(x);						\
})

unsigned long __no_sanitize_address
kasan_nocheck_read_once_ul(const volatile void *p)
{
	/* 'p' is void *, so cast before dereferencing */
	return READ_ONCE(*(const volatile unsigned long *)p);
}

/* Please don't use this */
#define READ_ONCE_NOCHECK(x)	kasan_nocheck_read_once_ul(&(x))

which would make sense for the unwinders, where there is concurrency
involved, but I'd be inclined to have them call kasan_nocheck_read_once_ul()
directly and ditch READ_ONCE_NOCHECK() so that it doesn't get used for
single-threaded code as a convenience to avoid annotation.
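
As a rough userspace sketch of what such a direct call from an unwinder
might look like (not kernel code, and equally untested against any real
unwinder): struct frame_record, unwind_step() and the no_sanitize_address_attr
macro are hypothetical names made up for illustration, and READ_ONCE() is
replaced by a plain volatile load since the kernel headers aren't available
here.

```c
#include <stdint.h>

/* Assumption: userspace stand-in for the kernel's __no_sanitize_address. */
#if defined(__GNUC__) || defined(__clang__)
#define no_sanitize_address_attr __attribute__((no_sanitize_address, noinline))
#else
#define no_sanitize_address_attr
#endif

/* One volatile load of an unsigned long, not instrumented by the sanitizer. */
static no_sanitize_address_attr unsigned long
kasan_nocheck_read_once_ul(const volatile void *p)
{
	return *(const volatile unsigned long *)p;
}

/* Hypothetical frame record: saved frame pointer, then return address. */
struct frame_record {
	unsigned long fp;
	unsigned long lr;
};

/*
 * Hypothetical unwind step. Checks that *fp lies within the stack bounds,
 * reads the record with unchecked loads, then sanity checks the result
 * before trusting it. Returns 0 on success, -1 if the record looks bogus.
 */
static int unwind_step(unsigned long *fp, unsigned long *pc,
		       unsigned long stack_lo, unsigned long stack_hi)
{
	const struct frame_record *rec;
	unsigned long next_fp, next_pc;

	if (*fp < stack_lo || *fp > stack_hi - sizeof(*rec) ||
	    (*fp % sizeof(unsigned long)))
		return -1;

	rec = (const struct frame_record *)(uintptr_t)*fp;

	/* The stack may be modified concurrently, so read each field once. */
	next_fp = kasan_nocheck_read_once_ul(&rec->fp);
	next_pc = kasan_nocheck_read_once_ul(&rec->lr);

	/* Records must move strictly up the stack, or terminate with 0. */
	if (next_fp != 0 && next_fp <= *fp)
		return -1;

	*fp = next_fp;
	*pc = next_pc;
	return 0;
}
```

The point being that only the two loads are exempt from checking, while
the bounds and ordering checks around them stay instrumented like any
other code.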

What do you think?

Will