[PATCHv2 00/28] huge tmpfs implementation using compound pages

Here is my implementation of huge page support in tmpfs/shmem. It's more
or less complete. I'm comfortable enough with it to run it on my workstation.

And it hasn't crashed so far. :)

The main difference from Hugh's approach[1] is that I continue with
compound pages, whereas Hugh invents a new way to couple pages: team pages.
I believe the THP refcounting rework made team pages unnecessary: compound
pages are flexible enough to serve the needs of the page cache.

Many ideas and some patches were stolen from Hugh's patchset. Having this
patchset around was very helpful.

I will continue with code validation. I expect mlock to require some
more attention.

Please review and test the code.

Git tree:

git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git hugetmpfs/v2

== Patchset overview ==

[01/28]
	I posted this patch last night. I hit the bug while testing
	huge tmpfs, but I think DAX has the same problem, so it
	should be applied now.

[02-05/28]
	These patches were also posted separately. They simplify the
	split_huge_page() code at the cost of some speed. I'm not sure if
	they should go upstream, but they make my life easier for now.
	The patches were slightly adjusted to handle file pages too.

[06-11/28]
	Rework the fault path and rmap to handle file pmd. Unlike DAX with
	vm_ops->pmd_fault, we don't need to ask the filesystem twice -- first
	for a huge page and then for a small one. If ->fault happens to
	return a huge page and the VMA is suitable for mapping it as huge,
	we do so.
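
A rough userspace sketch of the decision described above, for readers who
want the control flow spelled out. This is not code from the patches: every
name below (model_vma, model_page, model_finish_fault, the MODEL_* constants)
is hypothetical, the 2 MiB huge page size is an assumption matching x86-64,
and "suitable" is reduced to a simple range check, while the real condition
is likely to involve more than that.

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_PAGE_SIZE   4096UL
#define MODEL_HPAGE_SIZE  (512 * MODEL_PAGE_SIZE)	/* assumed: 2 MiB */

struct model_vma {
	unsigned long start;	/* first address of the mapping (inclusive) */
	unsigned long end;	/* end of the mapping (exclusive) */
};

struct model_page {
	bool huge;		/* what the single ->fault call handed back */
};

enum model_mapping { MODEL_MAP_PTE, MODEL_MAP_PMD };

/*
 * Simplified suitability check: the huge-page-aligned range around addr
 * must lie entirely inside the VMA.
 */
static bool model_vma_suitable(const struct model_vma *vma, unsigned long addr)
{
	unsigned long haddr = addr & ~(MODEL_HPAGE_SIZE - 1);

	return haddr >= vma->start && haddr + MODEL_HPAGE_SIZE <= vma->end;
}

/*
 * The filesystem has been asked once and returned 'page'. Map it with a
 * pmd only if it is huge and the VMA can take it; otherwise fall back to
 * a pte mapping without asking the filesystem again.
 */
static enum model_mapping model_finish_fault(const struct model_vma *vma,
					     unsigned long addr,
					     const struct model_page *page)
{
	if (page->huge && model_vma_suitable(vma, addr))
		return MODEL_MAP_PMD;
	return MODEL_MAP_PTE;
}
```

In this model, a huge page faulted into a VMA that covers the whole aligned
2 MiB range is mapped with a pmd. The same page faulted into a smaller or
misaligned VMA falls back to a pte mapping without a second trip into the
filesystem.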

[12-20/28]
	Various preparation of THP core for file pages.

[21-25/28]
	Various preparation of MM core for file pages.

[26-28/28]
	And finally, bring huge pages into tmpfs/shmem.
	Two of three patches came from Hugh's patchset. :)

[1] http://lkml.kernel.org/g/alpine.LSU.2.11.1502201941340.14414@eggly.anvils

Hugh Dickins (2):
  shmem: prepare huge=N mount option and /proc/sys/vm/shmem_huge
  shmem: get_unmapped_area align huge page

Kirill A. Shutemov (26):
  thp, dax: do not try to withdraw pgtable from non-anon VMA
  rmap: introduce rmap_walk_locked()
  rmap: extend try_to_unmap() to be usable by split_huge_page()
  mm: make remove_migration_ptes() beyond mm/migration.c
  thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers
  mm: do not pass mm_struct into handle_mm_fault
  mm: introduce fault_env
  mm: postpone page table allocation until do_set_pte()
  rmap: support file thp
  mm: introduce do_set_pmd()
  mm, rmap: account file thp pages
  thp, vmstats: add counters for huge file pages
  thp: support file pages in zap_huge_pmd()
  thp: handle file pages in split_huge_pmd()
  thp: handle file COW faults
  thp: handle file pages in mremap()
  thp: skip file huge pmd on copy_huge_pmd()
  thp: prepare change_huge_pmd() for file thp
  thp: run vma_adjust_trans_huge() outside i_mmap_rwsem
  thp: file pages support for split_huge_page()
  vmscan: split file huge pages before paging them out
  page-flags: relax policy for PG_mappedtodisk and PG_reclaim
  radix-tree: implement radix_tree_maybe_preload_order()
  filemap: prepare find and delete operations for huge pages
  truncate: handle file thp
  shmem: add huge pages support

 Documentation/filesystems/Locking |  10 +-
 arch/alpha/mm/fault.c             |   2 +-
 arch/arc/mm/fault.c               |   2 +-
 arch/arm/mm/fault.c               |   2 +-
 arch/arm64/mm/fault.c             |   2 +-
 arch/avr32/mm/fault.c             |   2 +-
 arch/cris/mm/fault.c              |   2 +-
 arch/frv/mm/fault.c               |   2 +-
 arch/hexagon/mm/vm_fault.c        |   2 +-
 arch/ia64/mm/fault.c              |   2 +-
 arch/m32r/mm/fault.c              |   2 +-
 arch/m68k/mm/fault.c              |   2 +-
 arch/metag/mm/fault.c             |   2 +-
 arch/microblaze/mm/fault.c        |   2 +-
 arch/mips/mm/fault.c              |   2 +-
 arch/mn10300/mm/fault.c           |   2 +-
 arch/nios2/mm/fault.c             |   2 +-
 arch/openrisc/mm/fault.c          |   2 +-
 arch/parisc/mm/fault.c            |   2 +-
 arch/powerpc/mm/copro_fault.c     |   2 +-
 arch/powerpc/mm/fault.c           |   2 +-
 arch/s390/mm/fault.c              |   2 +-
 arch/score/mm/fault.c             |   2 +-
 arch/sh/mm/fault.c                |   2 +-
 arch/sparc/mm/fault_32.c          |   4 +-
 arch/sparc/mm/fault_64.c          |   2 +-
 arch/tile/mm/fault.c              |   2 +-
 arch/um/kernel/trap.c             |   2 +-
 arch/unicore32/mm/fault.c         |   2 +-
 arch/x86/mm/fault.c               |   2 +-
 arch/xtensa/mm/fault.c            |   2 +-
 drivers/base/node.c               |  10 +-
 drivers/char/mem.c                |  24 ++
 drivers/iommu/amd_iommu_v2.c      |   2 +-
 drivers/iommu/intel-svm.c         |   2 +-
 fs/proc/meminfo.c                 |   5 +-
 fs/userfaultfd.c                  |  22 +-
 include/linux/huge_mm.h           |  29 +-
 include/linux/mm.h                |  33 +-
 include/linux/mmzone.h            |   3 +-
 include/linux/page-flags.h        |   6 +-
 include/linux/radix-tree.h        |   1 +
 include/linux/rmap.h              |   8 +-
 include/linux/shmem_fs.h          |  18 +-
 include/linux/userfaultfd_k.h     |   8 +-
 include/linux/vm_event_item.h     |   7 +
 ipc/shm.c                         |   6 +-
 kernel/sysctl.c                   |  12 +
 lib/radix-tree.c                  |  70 +++-
 mm/filemap.c                      | 220 +++++++----
 mm/gup.c                          |   7 +-
 mm/huge_memory.c                  | 714 ++++++++++++++--------------------
 mm/internal.h                     |  20 +-
 mm/ksm.c                          |   3 +-
 mm/memory.c                       | 796 +++++++++++++++++++++-----------------
 mm/mempolicy.c                    |   4 +-
 mm/migrate.c                      |  17 +-
 mm/mmap.c                         |  20 +-
 mm/mremap.c                       |  22 +-
 mm/nommu.c                        |   3 +-
 mm/page-writeback.c               |   1 +
 mm/rmap.c                         | 125 ++++--
 mm/shmem.c                        | 493 +++++++++++++++++++----
 mm/swap.c                         |   2 +
 mm/truncate.c                     |  22 +-
 mm/util.c                         |   6 +
 mm/vmscan.c                       |  15 +-
 mm/vmstat.c                       |   3 +
 68 files changed, 1727 insertions(+), 1104 deletions(-)

-- 
2.7.0
