On 3 Jul 2020, at 4:30, Michal Hocko wrote:
> On Fri 03-07-20 10:34:09, Catangiu, Adrian Costin wrote:
>> This patch adds logic to the kernel power code to zero out contents of
>> all MADV_WIPEONSUSPEND VMAs present in the system during its transition
>> to any suspend state equal or greater/deeper than Suspend-to-memory,
>> known as S3.
>
> How does the application learn that its memory got wiped? S2disk is an
> async operation and it can happen at any time during the task execution.
> So how does the application work to prevent from corrupted state - e.g.
> when suspended between two memory loads?
The usual trick when using MADV_WIPEONFORK, or BSD’s MAP_INHERIT_ZERO,
is to store a guard variable in the page and to check the variable any
time that random data is generated.
Here’s an example from BoringSSL, Google’s OpenSSL fork:
https://boringssl.googlesource.com/boringssl/+/ad5582985cc6b89d0e7caf0d9cc7e301de61cf66/crypto/fipsmodule/rand/fork_detect.c
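For readers who don't want to click through, here is a minimal sketch of the guard-variable trick on Linux. It is not the BoringSSL code. The names guard, guard_init() and guard_was_wiped() are made up for illustration, and it assumes a kernel and libc that expose MADV_WIPEONFORK (Linux 4.14 or later).

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical names, for illustration only. */

/* Points at a private anonymous page that the kernel zeroes in a forked
 * child once MADV_WIPEONFORK has been applied to it. */
static volatile uint32_t *guard;

/* Map the guard page, mark it wipe-on-fork, and arm the guard.
 * Returns 0 on success, -1 if the mapping or the madvise() call fails. */
int guard_init(void)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return -1;

    p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    if (madvise(p, (size_t)page, MADV_WIPEONFORK) != 0) {
        munmap(p, (size_t)page);
        return -1;
    }

    guard = p;
    *guard = 1;            /* non-zero means "not wiped since last check" */
    return 0;
}

/* Call this every time random data is about to be generated.
 * Returns 1 if the page was wiped (the caller should reseed), else 0.
 * The guard is re-armed so that only the next wipe is reported again. */
int guard_was_wiped(void)
{
    if (*guard != 0)
        return 0;
    *guard = 1;
    return 1;
}
```

A generator would call guard_was_wiped() at the top of its output path and reseed whenever it returns 1. Threads that share the generator state still need whatever locking they already use around the reseed itself.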
Checking a guard variable for non-zero status always happens
atomically and monotonically (it won’t suddenly flip back), which
is all that’s needed in this case. If userspace applications need to
build a larger critical section around the check, they can use regular
concurrency controls, but it really doesn’t come up in this context. With
WIPEONSUSPEND support in a kernel, I expect to add another madvise()
call on the existing page. The manyworldsdetector micro-library is an
example:
https://github.com/colmmacc/manyworldsdetector/blob/master/src/mwd.c
It’d be a new block in the style of lines 43-48.
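Roughly, and only as a sketch: MADV_WIPEONSUSPEND is the flag proposed in this patch, so no released header defines it, and the helper name guard_wipe_on_suspend() is made up here. This is not the code from mwd.c. Guarding the call with #ifdef keeps the same source building on systems without the flag.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper. Applies the proposed wipe-on-suspend advice to an
 * already-mapped guard page.
 * Returns  0 if the advice was applied,
 *          1 if the headers do not define MADV_WIPEONSUSPEND,
 *         -1 if madvise() rejected the request. */
int guard_wipe_on_suspend(void *page, size_t len)
{
#ifdef MADV_WIPEONSUSPEND
    return madvise(page, len, MADV_WIPEONSUSPEND) == 0 ? 0 : -1;
#else
    (void)page;
    (void)len;
    return 1;
#endif
}
```

The caller can treat a return of 1 as "no suspend detection available" and carry on with fork detection alone.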
-
Colm