Re: ❌ FAIL: Test report for kernel 5.4.0-rc2-d6c2c23.cki (stable-next)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Oct 15, 2019 at 04:26:51PM +0100, Catalin Marinas wrote:
> On Mon, Oct 14, 2019 at 10:33:32PM +0100, Will Deacon wrote:
> > On Mon, Oct 14, 2019 at 05:26:51PM +0100, Catalin Marinas wrote:
> > > On Mon, Oct 14, 2019 at 02:54:17PM +0200, Andrey Konovalov wrote:
> > > > What do you think we should do here?
> > > 
> > > It is an ABI break as the man page clearly states that the above case
> > > should return -ENOMEM.
> > 
> > Although I agree with your analysis, I also thought that these sorts of
> > ABI breaks (changes in error codes) were unfortunately common so we
> > shouldn't throw the baby out with the bath water here.
> > 
> > > The options I see:
> > > 
> > > 1. Revert commit 057d3389108e and try again to document that the memory
> > >    syscalls do not support tagged pointers
> > > 
> > > 2. Change untagged_addr() to only 0-extend from bit 55 or leave the
> > >    tag unchanged if bit 55 is 1. We could mask out the tag (0 rather
> > >    than sign-extend) but if we had an mlock test passing ULONG_MASK,
> > >    then we get -ENOMEM instead of -EINVAL
> > > 
> > > 3. Make untagged_addr() depend on the TIF_TAGGED_ADDR bit and we only
> > >    break the ABI for applications opting in to this new ABI. We did look
> > >    at this but the ptrace(PEEK/POKE_DATA) needs a bit more thinking on
> > >    whether we check the ptrace'd process or the debugger flags
> > > 
> > > 4. Leave things as they are, consider the address space 56-bit and
> > >    change the test to not use LONG_MAX on arm64. This needs to be passed
> > >    by the sparc guys since they probably have a similar issue
> > 
> > I'm in favour of (2) or (4) as long as there's also an update to the docs.
> 
> With (4) we'd start differing from other architectures supported by
> Linux. This works if we consider the test to be broken. However, reading
> the mlock man page:
> 
>        EINVAL The result of the addition addr+len was less than addr
>        (e.g., the addition may have resulted in an overflow).
> 
>        ENOMEM Some of the specified address range does not correspond to
>        mapped pages in the address space of the process.
> 
> There is no mention of EINVAL outside the TASK_SIZE, seems to fall more
> within the ENOMEM description above.

Sorry, I was being too vague in my wording. What I was trying to say is I'm
ok with (2) or (4), but either way we need to update our ABI documentation
under Documentation/arm64/. I personally don't think userspace will care
about EINVAL vs ENOMEM because the kernel is already horribly unreliable
when it comes to keeping error codes stable, which is why I think we could
get away with (4). For example, Jan (who reported this issue) wrote this
change to LTP last year for one of the mmap tests:

https://github.com/linux-test-project/ltp/commit/e7bab61882847609be3132a2f0d94f7469ff5d3e

The fact that we have tagging at all already means that we differ from
many other architectures.

> > > It's slightly annoying to find this now. We did run (part of) LTP but I
> > > guess we never run the POSIX conformance tests.
> > 
> > Yes, and this stuff was in linux-next for a while so it's worrying that
> > kernelci didn't spot it either. Hmm.
> 
> For some reason the mlock test was skipped around the time we pushed the
> TBI patches into -next:
> 
> https://qa-reports.linaro.org/lkft/linux-next-oe/tests/ltp-open-posix-tests/mlock_8-1?&page=2

Coincidence?

> Internally I don't think we've configured LTP with
> --with-open-posix-testsuite, so this explains why we missed it.

Ok, hopefully you can fix that now.

> > > My preference is 2 with a quick attempt below. This basically means
> > > clear the tag if it resembles a valid (tagged) user pointer, otherwise
> > > don't touch it (bit 55 set always means an invalid user pointer). Not
> > > sure how the generated code will look like but we could probably do
> > > something better in assembly directly.
> [...]
> > diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> > index b61b50bf68b1..c23c47360664 100644
> > --- a/arch/arm64/include/asm/memory.h
> > +++ b/arch/arm64/include/asm/memory.h
> > @@ -215,12 +215,18 @@ static inline unsigned long kaslr_offset(void)
> >   * up with a tagged userland pointer. Clear the tag to get a sane pointer to
> >   * pass on to access_ok(), for instance.
> >   */
> > -#define untagged_addr(addr)	\
> > +#define __untagged_addr(addr)	\
> >  	((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55))
> >  
> > +#define untagged_addr(addr)	({					\
> > +	u64 __addr = (__force u64)addr;					\
> > +	__addr &= __untagged_addr(__addr);				\
> > +	(__force __typeof__(addr))__addr;				\
> > +})
> > +
> >  #ifdef CONFIG_KASAN_SW_TAGS
> >  #define __tag_shifted(tag)	((u64)(tag) << 56)
> > -#define __tag_reset(addr)	untagged_addr(addr)
> > +#define __tag_reset(addr)	__untagged_addr(addr)
> >  #define __tag_get(addr)		(__u8)((u64)(addr) >> 56)
> >  #else
> >  #define __tag_shifted(tag)	0UL
> 
> This works for me. Szabolcs also suggested just zeroing the top byte but
> we wouldn't catch the overflow EINVAL case above, so I'd rather only
> mask the tag out if it was a user address (i.e. bit 55 is 0).

I'll spin it as a proper patch while we decide whether we want to do
anything.

Will



[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux