On Mon 11-10-21 11:40:12, David Hildenbrand wrote:
> On 11.10.21 11:28, Michal Hocko wrote:
> > On Fri 08-10-21 10:17:50, David Hildenbrand wrote:
> > > On 08.10.21 08:39, ultrachin@xxxxxxx wrote:
> > > > From: chen xiaoguang <xiaoggchen@xxxxxxxxxxx>
> > > >
> > > > The exit time is long when program allocated big memory and
> > > > the most time consuming part is free memory which takes 99.9%
> > > > of the total exit time. By using async free we can save 25% of
> > > > exit time.
> > > >
> > > > Signed-off-by: chen xiaoguang <xiaoggchen@xxxxxxxxxxx>
> > > > Signed-off-by: zeng jingxiang <linuszeng@xxxxxxxxxxx>
> > > > Signed-off-by: lu yihui <yihuilu@xxxxxxxxxxx>
> > >
> > > I recently discussed with Claudio if it would be possible to tear down the
> > > process MM deferred, because for some use cases (secure/encrypted
> > > virtualization, very large mmaps) tearing down the page tables is already
> > > the much more expensive operation.
> > >
> > > There is mmdrop_async(), and I wondered if one could reuse that concept when
> > > tearing down a process -- I didn't look into feasibility, however, so it's
> > > just some very rough idea.
> >
> > This is not a new problem. Large process tear down can take ages. The
> > primary road block has been accounting. This lot of work has to be
> > accounted to the proper domain (e.g. cpu cgroup).
>
> In general, yes. For some setups where admins don't care about that
> accounting (e.g., enabled via some magic toggle for large VMs), I guess this
> accounting isn't the major roadblock, correct?

Right, I would be careful about magic toggles though. Besides, there are
ways to achieve this in userspace. We used to have a request from a DB
vendor to help parallelize process exit, and Vlastimil came up with
clone(CLONE_VM) and madvise(MADV_DONTNEED) from several threads as a
"workaround". This would work properly from the accounting POV.
Admittedly a bit of an involved approach, though.
--
Michal Hocko
SUSE Labs
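
For illustration only, the userspace workaround mentioned above could look
roughly like the sketch below. This is not Vlastimil's actual code, which is
not part of this thread. The helper name parallel_discard(), the slice struct,
the worker limit and the stack size are all hypothetical. The sketch assumes
Linux with glibc, a page-aligned address and a private anonymous mapping.
Each helper is created with clone(CLONE_VM), so it shares the caller's address
space, and it is an ordinary child of the calling process, so the freeing work
should be charged wherever the process itself is charged.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_WORKERS 64                  /* hypothetical upper bound */
#define WORKER_STACK_SIZE (64 * 1024)   /* hypothetical, ample for one syscall */

/* One contiguous part of the region, handled by one helper. */
struct discard_slice {
    void *addr;
    size_t len;
};

/* Runs in the helper: drop the pages of one slice, exit 0 on success. */
static int discard_worker(void *arg)
{
    struct discard_slice *s = arg;

    return madvise(s->addr, s->len, MADV_DONTNEED) == 0 ? 0 : 1;
}

/*
 * Hypothetical helper: split [addr, addr + len) into page-aligned slices
 * and discard them from up to nworkers CLONE_VM children in parallel.
 * addr must be page aligned. Returns 0 on success, -1 on any failure.
 */
int parallel_discard(void *addr, size_t len, int nworkers)
{
    struct discard_slice slices[MAX_WORKERS];
    void *stacks[MAX_WORKERS];
    pid_t pids[MAX_WORKERS];
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages, chunk;
    int spawned = 0, rc = 0;

    if (len == 0 || nworkers < 1 || nworkers > MAX_WORKERS)
        return -1;

    /* Pages per helper, rounded up, so every slice starts on a page. */
    npages = (len + page - 1) / page;
    chunk = ((npages + (size_t)nworkers - 1) / (size_t)nworkers) * page;

    for (int i = 0; i < nworkers; i++) {
        size_t off = (size_t)i * chunk;

        if (off >= len)
            break;
        slices[i].addr = (char *)addr + off;
        slices[i].len = (len - off < chunk) ? len - off : chunk;

        /* Each helper needs its own stack because the VM is shared. */
        stacks[i] = mmap(NULL, WORKER_STACK_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
        if (stacks[i] == MAP_FAILED) {
            rc = -1;
            break;
        }

        /* The stack grows down, so pass the top of the mapping. */
        pids[i] = clone(discard_worker,
                        (char *)stacks[i] + WORKER_STACK_SIZE,
                        CLONE_VM | SIGCHLD, &slices[i]);
        if (pids[i] == -1) {
            munmap(stacks[i], WORKER_STACK_SIZE);
            rc = -1;
            break;
        }
        spawned++;
    }

    /* Reap every helper that was started, then release its stack. */
    for (int i = 0; i < spawned; i++) {
        int status;

        if (waitpid(pids[i], &status, 0) == -1 ||
            !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            rc = -1;
        munmap(stacks[i], WORKER_STACK_SIZE);
    }
    return rc;
}
```

A process would call something like parallel_discard(buf, len, 8) on its large
mappings shortly before exiting, so that the final exit has little left to
free. Plain pthreads would be an alternative way to get threads sharing the
same VM. Whether this helps, and with how many helpers, would have to be
measured on the workload in question.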