On Tue, Aug 26, 2014 at 09:16:56AM -0400, Joshua Kinard wrote:

> On 08/26/2014 08:03, Ralf Baechle wrote:
> > On Tue, Aug 26, 2014 at 07:06:56AM -0400, Joshua Kinard wrote:
> >
> >> o32 userland is the primary on both systems. However, the last SIGILL was
> >> under the 64k PAGE_SIZE kernel inside of an n32 chroot compiling the 'boost'
> >> package on the Octane, which I restarted that and it's not complained since.
> >> Also got SIGILL on the 16k PAGE_SIZE kernel when I booted 16k PAGE_SIZE the
> >> first time and ran 'ps'. Subsequent runs of 'ps' didn't reproduce the
> >> error. Also saw SIGILLs in the bootlog of the 16k PAGE_SIZE kernel when
> >> "rm" was ran once (couldn't reproduce) and when mdadm tried to put one of
> >> the arrays back together. Subsequent runs using similar argument lines
> >> don't reproduce once I got to a root shell.
> >>
> >> Being it's a Gentoo install...the o32 userland is pretty fresh. Especially
> >> on the Octane, where I literally rebuilt the old userland over 2-3 times
> >> just to make sure all the old 5-year cruft was gone. The n32 userland
> >> chroot is brand-spanking new. gcc-4.7.x only for now on both, because of
> >> PR61538 in gcc. Latest binutils.
> >>
> >> The O2 is chugging away happily so far in updating a bunch of packages. So
> >> I am leaning towards this being another quirk I have to hunt down in the
> >> Octane's code again. There isn't much in the Octane-specific code that
> >> deals with memory, though -- it seems the higher-level MIPS memory code
> >> handles most things just fine.
> >
> > Can you enable core dumps? I'm wondering about the EPC of the crashed
> > process. If it's at a function entry or the beginning of a page that
> > might indicate there is an issue with flushing caches after the containing
> > page got loaded. Also interesting to know if this possibly happened in a
> > signal trampoline or VDSO.
> >
> > These are just the usual suspects - nothing indicates this case is actually
> > related.
>
> (Missed the reply all on the last one)
>
> Enabled coredumps and got the 'shash' program to fail a second time (first
> program to do so)...so I'll rebuild that with debugging symbols and try to
> trip it up again later on.
>
> Is a core file from a binary w/o debugging of any value?

Yes - it will contain registers etc. Just what really matters in this case.
We don't need the debug info because we're not interested in debugging the
application.

  Ralf
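
For illustration only, here is a minimal sketch of the page-boundary check suggested above, assuming the faulting PC has already been read out of the core file by hand. The helper names (`page_offset`, `describe`) and the example address are hypothetical and do not come from the thread; the three page sizes are the 4k, 16k and 64k PAGE_SIZE configurations.

```python
# Hypothetical helper: report where a faulting PC (EPC) sits inside its page
# for each kernel PAGE_SIZE of interest. An offset of 0 means the PC is at
# the very beginning of a page, which is one of the "usual suspects" above.

PAGE_SIZES = {"4k": 4 * 1024, "16k": 16 * 1024, "64k": 64 * 1024}


def page_offset(epc: int, page_size: int) -> int:
    """Return the byte offset of epc within its page."""
    # The mask trick below is only valid for power-of-two page sizes.
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page_size must be a power of two")
    return epc & (page_size - 1)


def describe(epc: int) -> dict:
    """Map each page-size label to the offset of epc within that page."""
    return {name: page_offset(epc, size) for name, size in PAGE_SIZES.items()}


if __name__ == "__main__":
    epc = 0x00410000  # made-up example value, not taken from a real core
    for name, off in describe(epc).items():
        marker = " (start of page)" if off == 0 else ""
        print(f"{name}: offset 0x{off:x}{marker}")
```

This only covers the "beginning of a page" case. Whether the PC is at a function entry, in a signal trampoline or in the VDSO still has to be checked against the binary's symbols and the process's memory map.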