On 6/27/23 08:14, Michal Hocko wrote:
On Tue 27-06-23 16:57:53, David Hildenbrand wrote:
...
IIUC (John can correct me if I am wrong):
1) The process holds the device node open
2) The process gets killed or quits
3) As the process gets torn down, it closes the device node
4) Closing the device node results in the driver removing the device and
calling offline_and_remove_memory()
So it's not a "tear down process" that triggers that offlining+removal
somehow explicitly, it's just a side-product of it letting go of the device
node as the process gets torn down.
Isn't that just fragile? The operation might fail for other reasons. Why
cannot there be a hold on the resource to control the tear down
explicitly?
I'll let John comment on that. But from what I understood, in most setups
where ZONE_MOVABLE gets used for hotplugged memory
offline_and_remove_memory() succeeds and allows for reusing the device later
without a reboot.
For the cases where it doesn't work, a reboot is required.
That is exactly correct. That's what we ran into.
And there are workarounds (for example: kthreads don't have any signals
pending...), but I did want to follow through here and make -mm aware of the
problem. And see if there is a better way.
...
It seems that offline_and_remove_memory() is using the wrong operation
then, if it wants opportunistic offlining with some sort of policy. A
timeout might be just one policy to use, but a failure mode or a retry
count might be a better fit for some users. So rather than (ab)using
offline_pages(), would it make more sense to extract the basic offlining
steps and allow drivers like virtio-mem to reuse them and define their
own policy?
...like this, perhaps. Sounds promising!
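
To make the shape of that idea a bit more concrete, here is a rough
userspace sketch of a caller-defined policy wrapped around a single
"attempt the basic offlining steps" callback. This is not kernel code and
not a proposed API: offline_attempt_fn, struct offline_policy,
offline_with_policy() and fake_attempt() are all hypothetical names, and
treating -EBUSY (and optionally -EINTR) as retryable is an assumption made
only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical: one attempt at the extracted offlining steps.
 * Returns 0 on success or a negative errno such as -EBUSY or -EINTR.
 */
typedef int (*offline_attempt_fn)(void *ctx);

/* Hypothetical policy that a driver would define for itself. */
struct offline_policy {
	unsigned int max_attempts;	/* give up after this many tries */
	int retry_on_eintr;		/* nonzero: treat -EINTR as retryable */
};

/*
 * Run the attempt callback under the given policy. Returns 0 on success,
 * the last error once the policy gives up, or -EINVAL if the policy
 * allows zero attempts.
 */
static int offline_with_policy(offline_attempt_fn attempt, void *ctx,
			       const struct offline_policy *policy)
{
	int ret = -EINVAL;
	unsigned int i;

	assert(attempt != NULL && policy != NULL);

	for (i = 0; i < policy->max_attempts; i++) {
		ret = attempt(ctx);
		if (ret == 0)
			return 0;
		if (ret == -EBUSY)
			continue;	/* assumed transient: try again */
		if (ret == -EINTR && policy->retry_on_eintr)
			continue;	/* caller opted in to retrying */
		break;			/* anything else is treated as fatal */
	}
	return ret;
}

/*
 * Toy stand-in for the real work: fails with -EBUSY until the counter
 * pointed to by ctx reaches zero, then succeeds.
 */
static int fake_attempt(void *ctx)
{
	int *remaining_failures = ctx;

	if (*remaining_failures > 0) {
		(*remaining_failures)--;
		return -EBUSY;
	}
	return 0;
}
```

A caller would fill in a struct offline_policy (say, max_attempts = 3) and
pass its own attempt callback. A timeout-based policy could be expressed
the same way by checking a deadline in the loop instead of a counter.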
thanks,
--
John Hubbard
NVIDIA