In commit c0d0381ade79, changes were made to use i_mmap_rwsem for pmd
sharing synchronization.  This required changes to mm locking order that
are hugetlb specific.  Specifically, i_mmap_rwsem must be taken before
the page lock.  This is not a huge issue in hugetlb specific code, but
becomes more problematic in the areas of page migration and memory
failure where generic mm code had to deal with this change to lock
ordering.  An ugly routine 'hugetlb_page_mapping_lock_write' was added
to help with these issues.

Recently, Hugh Dickins diagnosed a migration BUG as caused by code
introduced with hugetlb i_mmap_rwsem synchronization [1].  Subsequent
discussion in that thread pointed out additional problems in the code.
Adding a rw_semaphore to the hugetlbfs inode for this type of
synchronization was mentioned.  Such an approach is actually 'cleaner'
as it can be inserted in the lock hierarchy where needed.  And, there is
no issue with other parts of the mm using this rw_semaphore.

This series adds a rw_semaphore (hinode_rwsem) to the hugetlbfs inode.
The first patch reverts all commits having to deal with the current use
of i_mmap_rwsem for pmd sharing and fault/truncate synchronization.

The revert of 5 commits was combined into a single patch.  I am looking
for feedback on this approach.  I considered:
- 5 patches to revert the 5 commits
- Reverting patches depending on c0d0381ade79, then having a patch to
  change from i_mmap_rwsem to hinode_rwsem.
To me, a 'clean slate' approach seemed best but I am open to whatever
would be easiest to review.

Changes in RFC v2
- Added missing locking pointed out by Naoya Horiguchi
- Cleaned up some comments as suggested by Naoya Horiguchi
- Cleaned up and documented hinode_lock_read() helper and added
  hinode_lock_write() helper.
- Split out addition of hinode_rwsem and helper routines to a separate
  patch.

[1] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.anvils/

Mike Kravetz (4):
  hugetlbfs: revert use of i_mmap_rwsem for pmd sharing and more sync
  hugetlbfs: add hinode_rwsem to hugetlb specific inode
  hugetlbfs: use hinode_rwsem for pmd sharing synchronization
  huegtlbfs: handle page fault/truncate races

 fs/hugetlbfs/inode.c    |  87 +++++++------
 include/linux/fs.h      |  15 ---
 include/linux/hugetlb.h | 135 ++++++++++++++++++--
 mm/hugetlb.c            | 267 ++++++++++++++++------------------------
 mm/memory-failure.c     |  34 ++---
 mm/memory.c             |   5 +
 mm/migrate.c            |  34 +++--
 mm/rmap.c               |  17 +--
 mm/userfaultfd.c        |  19 +--
 9 files changed, 322 insertions(+), 291 deletions(-)

-- 
2.25.4