Hi everyone,

I'm in the process of trying to port a debugging tool (http://rr-project.org/) from x86 to various other architectures. This tool relies on noting every change made to the memory of the process being debugged. As such, it has a battery of tests for corner cases of copyin/copyout, and it is one of these that I saw behaving strangely when ported to non-x86 architectures.

This particular test exercises the behavior of process_vm_readv (and process_vm_writev, but for simplicity let's assume readv here) with short local buffers. On x86, if the buffer is short and the following page is unmapped, the syscall will fill the remainder of the page and then return however many bytes it actually wrote. However, on other architectures (I mostly looked at arm64, though the same applies elsewhere), the behavior can be quite different. In general, the behavior depends strongly on factors like how close to the start of the copy region the page break occurs, how many bytes were supposed to be left after the page break, and the total size of the region to be copied. In various situations, I'm seeing:

- writes that end many bytes before the page break,
- bytes being modified beyond what the syscall result would indicate,
- combinations thereof.

I can work around this in my port, but I thought it might be valuable to ask where the line is between "architecture-defined behavior" and a bug that should be reported to the appropriate architecture maintainers and eventually fixed. For example, I think it would be nice if the syscall result matched the actual number of bytes written in all cases.

I've written a small program [1] that sets up this situation for various parameter values and prints the results. I have access to arm64, powerpc and x86, so I included results for those architectures, but I suspect other architectures have similar issues. The program should be easy to run to get your own results for a different architecture.

[1] https://gist.github.com/Keno/b247bca85219c4e3bdde9f7d7ff36c77

Thanks,
Keno
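
For anyone who wants a quick feel for the setup without pulling the gist, the core of it is roughly the following. This is a simplified sketch, not the exact program from [1]: it checks a single parameter combination rather than sweeping them, the 16-byte overhang and the 0xaa/0x55 fill patterns are arbitrary choices for spotting modified bytes, and error handling is minimal.

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>

int main(void)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t overhang = 16;	/* bytes the local iovec extends past the mapped page */

	/* Local buffer: two pages, with the second unmapped so the copy
	 * destination runs into an unmapped page partway through. */
	char *local = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	/* "Remote" source buffer in our own address space (pid = getpid()),
	 * filled with a different pattern so modified bytes stand out. */
	char *remote = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (local == MAP_FAILED || remote == MAP_FAILED)
		return 1;
	munmap(local + page, page);
	memset(local, 0xaa, page);
	memset(remote, 0x55, 2 * page);

	/* Request more bytes than fit before the unmapped page. */
	struct iovec liov = { .iov_base = local,  .iov_len = page + overhang };
	struct iovec riov = { .iov_base = remote, .iov_len = page + overhang };
	ssize_t ret = process_vm_readv(getpid(), &liov, 1, &riov, 1, 0);

	/* Count how many bytes of the still-mapped page now hold the source
	 * pattern, i.e. were actually written by the syscall, and compare
	 * that against the syscall's return value. */
	long modified = 0;
	for (long i = 0; i < page; i++)
		if (local[i] == 0x55)
			modified++;

	printf("requested %ld, syscall returned %zd, bytes modified %ld\n",
	       (long)(page + overhang), ret, modified);
	return 0;
}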