On 22-Jan-14, at 5:27 PM, Mikulas Patocka wrote:
The majority of the messages occur because of a binutils bug that was
fixed several years ago.
There's a non-equivalent mapping between the last code page and the
start of writable data in almost every application and shared library
in Debian 5. This is fixed in the current Debian unstable and Gentoo.
So, I recommend updating.
So far I haven't had any problems with Debian 5, so I prefer it to the
constantly changing unstable.
Anyway, the kernel should work with Debian 5 - the only way to install
a new parisc system is to install Debian 5 and then switch to
debian-ports.
Actually, Helge has a lifimage that allows a new Debian system to be
set up with debootstrap. Helge is working toward a new installer.
There's documentation on the wiki on how to do it.
At the moment, we are stuck with the constantly changing unstable. I
want to say though that even if your suggestion below works, there are
non-equivalent aliases in most Debian 5 applications and libraries. So,
even if kernel updates to pages are done equivalently, this might cause
issues.
When the user aliases are re-enabled, we have the following situation
when non-equivalent aliases exist:
"All other uses of non-equivalent aliasing (including simultaneously
enabling multiple non-equivalently aliased translations where one or
more allow for write access) are prohibited, and can cause machine
checks or silent data corruption, including data corruption of
unrelated memory on unrelated pages."
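As a rough illustration of what "equivalent" means here, the check below treats two virtual addresses as equivalent aliases when they agree modulo the aliasing stride. This is a minimal sketch, not kernel code: the 4 MB stride is an assumption taken from the 1024 page aliasings (4k pages) mentioned later in this thread, the names ALIAS_STRIDE and equivalent_alias() are hypothetical, and the space identifier part of the architecture's rule is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed aliasing stride: 1024 aliasings of 4k pages = 4 MB. */
#define ALIAS_STRIDE (UINT64_C(4) * 1024 * 1024)

/*
 * Hypothetical helper: two virtual addresses are treated as equivalent
 * aliases when they differ only in bits above the aliasing stride.
 * Space identifiers are deliberately left out of this sketch.
 */
static bool equivalent_alias(uint64_t va1, uint64_t va2)
{
    return ((va1 ^ va2) & (ALIAS_STRIDE - 1)) == 0;
}
```

Under these assumptions, a code page and a writable data page that share a physical page but are mapped at addresses differing by, say, one page would fail this check, which is the situation described above for Debian 5 binaries.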
I'm not sure that we correctly handle the case where there are only
equivalent user aliases.
Calling flush_dcache_page() was a step in this direction but
unfortunately Helge and I have found a side effect (zombies run by
expect in gcc/gdb testsuites). I've also found another situation where
non-equivalent aliases are generated.
I tend to think the message should be a debug message.
Dave
--
John David Anglin dave.anglin@xxxxxxxx
There is another problem - flushing the cache in kmap_atomic doesn't
fix inequivalent aliasing because there may be other threads on other
CPUs touching that page from userspace simultaneously.
That is the fundamental issue. In part, it may be the assumptions
surrounding how COW is implemented. I know reverting the kmap part of
the change works better.
In that implementation, copy_user_page() flushes the from page itself.
I know the above is pretty solid as I ran with it for two weeks
without any obvious cache issues.
What led us to the kmap flush is that the aio code reads and writes
the kernel pages. Is it possible that user access isn't involved there
or there's a user flush before the aio operation?
In some sense, this would seem to be a Linux core design problem if
access to shared pages isn't controlled. I imagine various ARM variants
would also break from these issues.
I got an idea that it could be possible to implement kmap_atomic
without flushing the cache - currently, 64-bit pagetables map 2^41
bytes of memory. You can hack the kernel tlb handler, so that the
addresses above 2^41 map to the same memory as base kernel space, just
shifted by a few pages.
Suppose that the following ranges in the kernel address space map to
the same memory:
0 ... 2^41-1 (the original kernel mapping)
2^41 + 4096 ... 2*2^41 + 4095 (an alias shifted by 4k)
2*2^41 + 8192 ... 3*2^41 + 8191 (an alias shifted by 8k)
3*2^41 + 12288 ... 4*2^41 + 12287 (an alias shifted by 12k)
... etc for all 1024 page aliasings.
1023*2^41 + 4190208 ... 1024*2^41 + 4190207 (an alias shifted by 4M-4k)
Then, kmap_atomic could select a kernel mapping that has the same
cache-equivalence as the existing userspace mapping and simply return
it to kernelspace without flushing the cache.
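The selection step described above can be sketched in plain C. This is only an illustration of the address arithmetic, not a kmap_atomic implementation: it assumes page-aligned addresses, assumes that cache equivalence means agreement modulo 4 MB (1024 aliasings of 4k pages), and the names alias_base(), pick_alias() and aliased_kaddr() are hypothetical. The TLB handler change that would make these windows map to the same memory is not shown.

```c
#include <stdint.h>

#define PAGE_SHIFT   12     /* 4k pages, as in the ranges listed above */
#define WINDOW_SHIFT 41     /* each alias window covers 2^41 bytes */
#define NUM_ALIASES  1024   /* "all 1024 page aliasings" */

/* Start of the k-th alias window: k*2^41 + k*4096. */
static uint64_t alias_base(unsigned int k)
{
    return ((uint64_t)k << WINDOW_SHIFT) + ((uint64_t)k << PAGE_SHIFT);
}

/*
 * Choose the window in which the page at kaddr (an address in the
 * original 0 ... 2^41-1 mapping) lands on the same 4 MB offset as the
 * user address uaddr. Both addresses are assumed to be page aligned.
 * Unsigned wraparound is harmless because 2^64 is a multiple of 4 MB.
 */
static unsigned int pick_alias(uint64_t kaddr, uint64_t uaddr)
{
    return (unsigned int)(((uaddr - kaddr) >> PAGE_SHIFT) & (NUM_ALIASES - 1));
}

/* Kernel address of the same page, seen through the chosen window. */
static uint64_t aliased_kaddr(uint64_t kaddr, uint64_t uaddr)
{
    return alias_base(pick_alias(kaddr, uaddr)) + kaddr;
}
```

Because 2^41 is a multiple of 4 MB, only the k*4096 shift changes the offset within the 4 MB stride, so exactly one of the 1024 windows matches any given user address.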
This is a very interesting suggestion. I wasn't aware that the kernel
mapping could be controlled in this way.
Dave
--
John David Anglin dave.anglin@xxxxxxxx