Re: Endless loop on execution attempt on non-executable page

Ralf Baechle <ralf@xxxxxxxxxxxxxx> · Thu, 12 May 2016 16:23:06 +0200

On Thu, May 12, 2016 at 03:07:51PM +0200, Florian Weimer wrote:

> On 05/12/2016 02:53 PM, Ralf Baechle wrote:
> > On Thu, May 12, 2016 at 12:46:37PM +0200, Florian Weimer wrote:
> > 
> > > The GCC compile farm has a big-endian 64-bit MIPS box.  The kernel is:
> > > 
> > > Linux erpro8-fsf1 3.14.10-er8mod-00013-ge0fe977 #1 SMP PREEMPT Wed Jan
> > > 14 12:33:22 PST 2015 mips64 GNU/Linux
> > > 
> > > Which is a vendor kernel for the EdgeRouter Pro-8.
> > > 
> > > /proc/cpuinfo reports:
> > > 
> > > system type             : UBNT_E200 (CN6120p1.1-1000-NSP)
> > > machine                 : Unknown
> > > processor               : 0
> > > cpu model               : Cavium Octeon II V0.1
> > > 
> > > While testing W^X (execmod, DEP, NX) stack enforcement, I noticed that once
> > > I try to execute code off a non-executable page, I do not get a signal, but
> > > the code appears to enter an infinite loop.  The generated function starts
> > > with a jump instruction to return to the caller, but instead, the program
> > > counter does not seem to change at all.
> > > 
> > > “si” in GDB also hangs (but can be interrupted with ^C).
> > > 
> > > My test code is here:
> > > 
> > >   https://pagure.io/execmod-tests
> > > 
> > > Is this a kernel bug or an issue with the silicon?
> > 
> > I see the test case uses mprotect to add PROT_EXEC after writing the code
> > to memory.  I don't think mprotect however gives any guarantee that this
> > will make the I-cache coherent with the D-cache, that is that the CPU will
> > actually fetch and execute the instructions that were just written to memory.
> > For that you have to do something architecture specific such as dancing
> > around a fire waving a dead chicken.  Or on MIPS call cacheflush(), see
> > the man page for details.
> 
> There is a fork between the write and the execute.  It is somewhat unlikely
> that that's not a barrier, but it did happen on POWER.
> 
> However, I can successfully execute code without the barrier, so this whole
> thing goes in the wrong direction. :)
> 
> I have added it, just to be on the safe side.
> 
> > For the sake of portability to some broken processors you should also
> > ensure that a 32 byte cache line is entirely filled with valid instructions
> > by padding the two test instructions with another six no-ops (opcode 0).
> 
> Added as well.
> 
> > The test case as it is guarantees this implicitly by using a freshly
> > allocated page but I thought I should mention it.
> 
> There are some tests that don't (the stack variable might be clobbered, for
> example).
> 
> Anyway, neither change fixed things for me.  Given the peculiar “si”
> behavior in GDB, that's not entirely unexpected ...
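
For the record, the two changes discussed above (synchronising the caches
after writing the code, and padding the stub to a full 32-byte line) amount
to roughly the following.  This is only a sketch, not the code from the
execmod-tests repository: the names fill_return_stub() and map_return_stub()
are made up for illustration, the 0x03e00008 word assumes the classic
(pre-R6) encoding of "jr $ra", and __builtin___clear_cache() is the GCC/Clang
builtin standing in for a direct cacheflush(addr, nbytes, BCACHE) call from
<sys/cachectl.h> on MIPS.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

enum { CACHE_LINE_WORDS = 8 };  /* 8 x 4-byte instructions = 32 bytes */

/* Hypothetical helper: fill one cache line with "jr $ra", its delay-slot
 * no-op, and six more no-ops so the whole line holds valid instructions. */
static void fill_return_stub(uint32_t *words)
{
    assert(words != NULL);
    words[0] = 0x03e00008u;     /* jr $ra (assumed pre-R6 MIPS encoding) */
    for (size_t i = 1; i < CACHE_LINE_WORDS; i++)
        words[i] = 0;           /* opcode 0 is a no-op on MIPS */
}

/* Hypothetical helper: map a fresh page, write the stub, synchronise the
 * caches, then switch the page to `prot`.  Returns NULL on failure. */
static void *map_return_stub(int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    uint32_t *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    fill_return_stub(p);

    /* Make the I-cache coherent with what was just written through the
     * D-cache.  mprotect() alone is not assumed to do this. */
    __builtin___clear_cache((char *)p,
                            (char *)p + CACHE_LINE_WORDS * sizeof *p);

    if (mprotect(p, (size_t)page, prot) != 0) {
        munmap(p, (size_t)page);
        return NULL;
    }
    return p;
}
```

On a MIPS target, map_return_stub(PROT_READ | PROT_EXEC) should give a page
that can be called through a function pointer and returns immediately.  With
PROT_READ alone, the expected outcome of the call is a signal, not the hang
described above.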

Thanks for fixing and testing these obvious things.  Now let's look one
or two levels deeper ...

  Ralf