Re: crashes in 4.10 because of "parisc: Enable KASLR"

On 2017-02-01 3:10 PM, Mikulas Patocka wrote:
>> I'm not 100% convinced that 4.9 is fully stable and that the patch
>> is the reason for the crashes you see.
>> What kind of crashes do you see? Userspace or kernel ?
> Userspace crashes. Random crashes or internal errors in gcc when compiling
> the kernel. I once had "aptitude" crash.
The userspace crashes are present in 4.8 and 4.9 as well. For example, this build failed due to an OS problem:
https://buildd.debian.org/status/fetch.php?pkg=kdenlive&arch=hppa&ver=16.12.1-2&stamp=1485956026&raw=0

Probably 10% or more of large packages fail to build because of this. Note that this only occurs on machines (e.g., c8000) that only support equivalent aliases. We don't see it on the parisc buildd, which has two PA8600 CPUs.

My current theory is the following functions are buggy:

/* vmap range flushes and invalidates.  Architecturally, we don't need
 * the invalidate, because the CPU should refuse to speculate once an
 * area has been flushed, so invalidate is left empty */
static inline void flush_kernel_vmap_range(void *vaddr, int size)
{
        unsigned long start = (unsigned long)vaddr;

        flush_kernel_dcache_range_asm(start, start + size);
}
static inline void invalidate_kernel_vmap_range(void *vaddr, int size)
{
        unsigned long start = (unsigned long)vaddr;
        void *cursor = vaddr;

        for ( ; cursor < vaddr + size; cursor += PAGE_SIZE) {
                struct page *page = vmalloc_to_page(cursor);

                if (test_and_clear_bit(PG_dcache_dirty, &page->flags))
                        flush_kernel_dcache_page(page);
        }
        flush_kernel_dcache_range_asm(start, start + size);
}

The kernel sets up a vmap range for I/O, and we have non-equivalent aliases to the offset map pages. I know that PG_dcache_dirty is never set when these routines are called, so the for loop does nothing. Nuking the whole data cache appears to fix the application errors, but my test was cut short by a second problem. No one else seems to do anything with the offset map, so we might have a parisc-specific driver problem.
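
To make the point about the loop concrete, here is a small userspace model of the control flow in invalidate_kernel_vmap_range(). It is only a sketch: model_page, flush_stats, and the MODEL_* constants are hypothetical stand-ins, not kernel names, and nothing here touches a real cache. It just counts how many per-page flushes the loop would issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these names exist in the kernel. */
#define MODEL_PAGE_SIZE        4096u
#define MODEL_PG_DCACHE_DIRTY  (1u << 0)

struct model_page {
        unsigned int flags;
};

struct flush_stats {
        size_t page_flushes;   /* per-page flushes issued by the loop */
        size_t range_flushes;  /* range flushes through the vmap alias */
};

/* Stand-in for test_and_clear_bit(): return the old bit, then clear it. */
static bool model_test_and_clear_dirty(struct model_page *page)
{
        bool was_dirty = (page->flags & MODEL_PG_DCACHE_DIRTY) != 0;

        page->flags &= ~MODEL_PG_DCACHE_DIRTY;
        return was_dirty;
}

/*
 * Mirrors the structure of invalidate_kernel_vmap_range() quoted above.
 * The caller must pass at least one model_page per MODEL_PAGE_SIZE bytes
 * of 'size'.
 */
static struct flush_stats
model_invalidate_vmap_range(struct model_page *pages, size_t size)
{
        struct flush_stats stats = { 0, 0 };
        size_t offset;

        assert(pages != NULL);

        for (offset = 0; offset < size; offset += MODEL_PAGE_SIZE) {
                struct model_page *page = &pages[offset / MODEL_PAGE_SIZE];

                /* Only pages marked dirty get a flush via the offset map. */
                if (model_test_and_clear_dirty(page))
                        stats.page_flushes++;
        }

        /* The range flush through the vmap address always happens. */
        stats.range_flushes++;
        return stats;
}
```

With every flags word zero, the model reports no per-page flushes and exactly one range flush, which is the situation described above: only the vmap alias is flushed, and the offset map alias is left alone.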

We also have a down_read/up_read problem where applications stall forever and cannot be killed (D state in top). Some stalls seemed related to signal processing, but they have occurred in other situations as well. They seem more prevalent in recent kernels; for example, I can't remember this happening with the 3.18 branch. The problem seems to be triggered by application tests involving multiple threads (glibc, gcc go and libgomp, and mariadb).

Dave

--
John David Anglin  dave.anglin@xxxxxxxx
