On Mon, Nov 26, 2018 at 03:29:17PM -0800, Hugh Dickins wrote:
> khugepaged's collapse_shmem() does almost all of its work, to assemble
> the huge new_page from 512 scattered old pages, with the new_page's
> refcount frozen to 0 (and refcounts of all old pages so far also frozen
> to 0). Including shmem_getpage() to read in any which were out on swap,
> memory reclaim if necessary to allocate their intermediate pages, and
> copying over all the data from old to new.
>
> Imagine the frozen refcount as a spinlock held, but without any lock
> debugging to highlight the abuse: it's not good, and under serious load
> heads into lockups - speculative getters of the page are not expecting
> to spin while khugepaged is rescheduled.
>
> One can get a little further under load by hacking around elsewhere;
> but fortunately, freezing the new_page turns out to have been entirely
> unnecessary, with no hacks needed elsewhere.
>
> The huge new_page lock is already held throughout, and guards all its
> subpages as they are brought one by one into the page cache tree; and
> anything reading the data in that page, without the lock, before it has
> been marked PageUptodate, would already be in the wrong. So simply
> eliminate the freezing of the new_page.
>
> Each of the old pages remains frozen with refcount 0 after it has been
> replaced by a new_page subpage in the page cache tree, until they are all
> unfrozen on success or failure: just as before. They could be unfrozen
> sooner, but cause no problem once no longer visible to find_get_entry(),
> filemap_map_pages() and other speculative lookups.
>
> Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
> Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
> Cc: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx # 4.8+

Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

-- 
 Kirill A. Shutemov
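
For readers who want the "frozen refcount as a spinlock held" analogy spelled out, here is a minimal user-space sketch in C11 atomics. It is not kernel code: struct model_page and every model_* function are hypothetical stand-ins that only imitate the freeze, unfreeze and get-unless-zero behaviour described in the quoted text. The bounded retry count is also an assumption made so the example terminates. A real speculative getter has no such bound, which is why a long freeze can head into lockups.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model of a page refcount; NOT the kernel's struct page. */
struct model_page {
    atomic_int refcount;
};

/* Freeze: succeeds only if the refcount equals `expected`, and sets it to 0. */
static bool model_ref_freeze(struct model_page *p, int expected)
{
    return atomic_compare_exchange_strong(&p->refcount, &expected, 0);
}

/* Unfreeze: restore a non-zero refcount so that getters can succeed again. */
static void model_ref_unfreeze(struct model_page *p, int count)
{
    atomic_store(&p->refcount, count);
}

/* Speculative get: take a reference unless the refcount is currently 0. */
static bool model_get_unless_zero(struct model_page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, `old` is reloaded with the current value and rechecked. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/*
 * Lookup loop of a speculative getter. Returns the number of attempts
 * needed, or -1 if the page stayed frozen for all `max_tries` attempts.
 * The bound exists only to keep this sketch finite.
 */
static int model_speculative_lookup(struct model_page *p, int max_tries)
{
    for (int tries = 1; tries <= max_tries; tries++) {
        if (model_get_unless_zero(p))
            return tries;
    }
    return -1;
}
```

While the model page is frozen, model_speculative_lookup() exhausts its budget without ever obtaining a reference. Once the page is unfrozen, the first attempt succeeds. In this model, the time between freeze and unfreeze is exactly the time every concurrent getter spends retrying, so the less work done inside that window, the better.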