On 27.06.23 14:34, Michal Hocko wrote:
> On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
>> Let's check for fatal signals only. That looks cleaner and still keeps
>> the documented use case for manual user-space triggered memory offlining
>> working. From Documentation/admin-guide/mm/memory-hotplug.rst:
>> % timeout $TIMEOUT offline_block | failure_handling
>> In fact, we even document there: "the offlining context can be terminated
>> by sending a fatal signal".
> We should be fixing documentation instead. This could break users who do
> have a SIGALRM signal handler installed.
You mean because timeout will send a SIGALRM, which is not considered
fatal in case a signal handler is installed?
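
As a rough user-space illustration of that distinction (a sketch, not kernel code): SIGALRM terminates a process only while it is left at its default disposition. Once a handler is installed, the signal goes to the handler and the process lives on. The two child programs and the 0.2 s timer below are made up for the demonstration.

```python
import signal
import subprocess
import sys

# Child that leaves SIGALRM at its default disposition (terminate).
DEFAULT_CHILD = """
import signal, time
signal.signal(signal.SIGALRM, signal.SIG_DFL)
signal.setitimer(signal.ITIMER_REAL, 0.2)
time.sleep(5)
"""

# Child that installs a SIGALRM handler, so the signal is no longer fatal.
HANDLER_CHILD = """
import signal, time

class Timeout(Exception):
    pass

def on_alarm(signum, frame):
    raise Timeout

signal.signal(signal.SIGALRM, on_alarm)
signal.setitimer(signal.ITIMER_REAL, 0.2)
try:
    time.sleep(5)
except Timeout:
    print("interrupted, still alive")
"""

def run_child(source):
    """Run a small Python program in a separate process and collect its result."""
    return subprocess.run([sys.executable, "-c", source],
                          capture_output=True, text=True, timeout=30)

default_result = run_child(DEFAULT_CHILD)
handler_result = run_child(HANDLER_CHILD)

# A negative return code -N means the child was terminated by signal N.
print("default disposition:", default_result.returncode)
print("handler installed:  ", handler_result.returncode,
      handler_result.stdout.strip())
```

The first child dies from the signal, while the second only sees its sleep interrupted and exits normally. A check for fatal signals only would treat the second case as "keep going".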
At least the "traditional" tools I am aware of don't set a timeout at
all (crossing fingers that they never end up stuck):
* chmem
* QEMU guest agent
* powerpc-utils
libdaxctl also doesn't seem to implement an easy-to-spot timeout for
memory offlining, but it also doesn't configure SIGALRM.
Of course, that doesn't mean that there isn't a program somewhere that
does that; I merely assume that it would be pretty unlikely to find such
a program.
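
For reference, a tool that does want a timeout can stay independent of the outcome here by terminating the offlining context with SIGKILL, which is always fatal. A minimal sketch, assuming the usual sysfs interface where "offline" is written to a memory block's state file; the helper name offline_block and the inline writer program are hypothetical and not taken from any of the tools above.

```python
import subprocess
import sys

# Hypothetical writer: a single write of "offline" to the given state file.
WRITER = (
    "import os, sys; "
    "fd = os.open(sys.argv[1], os.O_WRONLY); "
    "os.write(fd, b'offline'); "
    "os.close(fd)"
)

def offline_block(state_path, timeout_s):
    """Return True if the write finished in time, False if it was killed."""
    proc = subprocess.Popen([sys.executable, "-c", WRITER, state_path])
    try:
        return proc.wait(timeout=timeout_s) == 0
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught, so it is fatal either way
        proc.wait()  # reap the child
        return False
```

Usage would look like offline_block("/sys/devices/system/memory/memory42/state", 60), where block 42 is just an example number. Running the write in a child process keeps the caller's own signal handlers out of the picture.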
But no strong opinion: we can also keep it like that, update the doc and
add a comment explaining why this one here is different from most other
signal backoff checks.
Thanks!
--
Cheers,
David / dhildenb