On Tue 27-08-19 11:32:15, Kirill A. Shutemov wrote:
> On Tue, Aug 27, 2019 at 07:59:41AM +0200, Michal Hocko wrote:
> > > > > > IIUC deferred splitting is mostly a workaround for nasty locking issues
> > > > > > during splitting, right? This is not really an optimization to cache
> > > > > > THPs for reuse or something like that. What is the reason this is not
> > > > > > done from a worker context? At least THPs which would be freed
> > > > > > completely sound like a good candidate for kworker tear down, no?
> > > > > Yes, deferred split THP was introduced to avoid locking issues according to
> > > > > the document. Memcg awareness would help to trigger the shrinker more often.
> > > > >
> > > > > I think it could be done in a worker context, but when to trigger to worker
> > > > > is a subtle problem.
> > > > Why? What is the problem to trigger it after unmap of a batch worth of
> > > > THPs?
> > >
> > > This leads to another question, how many THPs are "a batch of worth"?
> >
> > Some arbitrary reasonable number. Few dozens of THPs waiting for split
> > are no big deal. Going into GB as you pointed out above is definitely a
> > problem.
>
> This will not work if these GBs worth of THPs are pinned (like with
> RDMA).

Yes, but this is a case we cannot do anything about in any deferred
scheme unless we hook into the unpinning call path. We might get there
eventually with the newly forming API.

> We can kick the deferred split each N calls of deferred_split_huge_page()
> if more than M pages queued or something.

Yes, that sounds reasonable to me. N can be a few dozen THPs. An
explicit flush API after unmap is done would be helpful as well.

> Do we want to kick it again after some time if split from deferred queue
> has failed?

I wouldn't mind having the reclaim path do the fallback and see how that
-- 
Michal Hocko
SUSE Labs
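
For illustration only, the "kick every N calls if more than M pages are queued" heuristic and the explicit flush discussed above could be modelled roughly as follows. This is a userspace sketch, not kernel code: `struct split_queue`, `queue_deferred_split()`, `flush_deferred_split()`, `kick_worker()` and both threshold values are hypothetical names and numbers chosen for the example, and the "worker" simply empties the counter instead of scheduling real work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical thresholds; the thread only refers to them as "N" and "M". */
#define KICK_EVERY_N_CALLS 32   /* check the queue once per N queueing calls */
#define MIN_QUEUED_M       64   /* kick only if more than M THPs are queued  */

/* Hypothetical userspace model of a deferred split queue. */
struct split_queue {
    size_t queued;  /* THPs currently waiting for a split      */
    size_t calls;   /* queueing calls since the last N-check   */
    size_t kicks;   /* number of times the worker was kicked   */
};

/* Stand-in for scheduling a worker: here it just drains the queue. */
static void kick_worker(struct split_queue *q)
{
    q->kicks++;
    q->queued = 0;
}

/*
 * Model of queueing one THP for a deferred split.
 * Returns true if this call kicked the worker.
 */
bool queue_deferred_split(struct split_queue *q)
{
    assert(q != NULL);

    q->queued++;
    if (++q->calls < KICK_EVERY_N_CALLS)
        return false;           /* not yet the Nth call */

    q->calls = 0;
    if (q->queued <= MIN_QUEUED_M)
        return false;           /* backlog still small enough */

    kick_worker(q);
    return true;
}

/* Model of an explicit flush once an unmap batch is done. */
void flush_deferred_split(struct split_queue *q)
{
    assert(q != NULL);

    if (q->queued > 0)
        kick_worker(q);
    q->calls = 0;
}
```

With these example values the backlog is only inspected on every 32nd call, so the common path stays cheap, while the explicit flush covers the tail of a batch that never reaches the threshold. Pinned pages and failed splits are not modelled here.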