On Fri, May 05, 2017 at 09:29:12AM +0200, Jan Kara wrote:
> On Thu 04-05-17 13:59:09, Ross Zwisler wrote:
> > dax_invalidate_mapping_entry() currently removes DAX exceptional entries
> > only if they are clean and unlocked.  This is done via:
> >
> > 	invalidate_mapping_pages()
> > 	  invalidate_exceptional_entry()
> > 	    dax_invalidate_mapping_entry()
> >
> > However, for page cache pages removed in invalidate_mapping_pages() there
> > is an additional criterion, which is that the page must not be mapped.  This
> > is noted in the comments above invalidate_mapping_pages() and is checked in
> > invalidate_inode_page().
> >
> > For DAX entries this means that we can end up in a situation where a
> > DAX exceptional entry, either a huge zero page or a regular DAX entry,
> > could end up mapped but without an associated radix tree entry.  This is
> > inconsistent with the rest of the DAX code and with what happens in the
> > page cache case.
> >
> > We aren't able to unmap the DAX exceptional entry because according to its
> > comments invalidate_mapping_pages() isn't allowed to block, and
> > unmap_mapping_range() takes a write lock on the mapping->i_mmap_rwsem.
> >
> > We could potentially do an rmap walk to see if each of the entries actually
> > has any active mappings before we remove it, but this might end up being
> > very expensive and doesn't currently look to be worth it.
> >
> > So, just remove dax_invalidate_mapping_entry() and leave the DAX entries in
> > the radix tree.
> >
> > Signed-off-by: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> > Fixes: c6dcf52c23d2 ("mm: Invalidate DAX radix tree entries only if appropriate")
> > Reported-by: Jan Kara <jack@xxxxxxx>
> > Reviewed-by: Jan Kara <jack@xxxxxxx>
> > Cc: <stable@xxxxxxxxxxxxxxx>    [4.10+]
>
> Ah, I've just sent out a series which contains these two patches and
> another two patches which change the entry locking to fix the last spotted
> race... So either just take my last two patches on top of these two or
> take my series as a whole.

Sounds good.  You added a better comment in invalidate_inode_pages2_range(),
so let's just use your version of this series.
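
For anyone reading this in the archive, the inconsistency described in the
quoted changelog can be illustrated with a small userspace model.  This is
only a toy sketch and not kernel code: struct toy_entry and every toy_*()
function are hypothetical names invented for this illustration, and a
handful of booleans stand in for the real radix tree and rmap state.  It
assumes nothing beyond what the changelog states, namely that the old DAX
path dropped entries that were clean and unlocked, while the page cache
path additionally refuses to drop anything that is still mapped.

```c
#include <stdbool.h>

/* Toy stand-in for one radix tree slot; NOT the kernel's data structure. */
struct toy_entry {
	bool present;	/* the slot still holds an entry */
	bool dirty;
	bool locked;
	bool mapped;	/* some process still has this range mapped */
};

/*
 * Old DAX behaviour as described in the changelog: drop the entry if it
 * is clean and unlocked, without looking at whether it is mapped.
 */
bool toy_dax_invalidate_old(struct toy_entry *e)
{
	if (!e->present || e->dirty || e->locked)
		return false;
	e->present = false;
	return true;
}

/* Page cache rule as described: same checks, plus "must not be mapped". */
bool toy_page_invalidate(struct toy_entry *e)
{
	if (!e->present || e->dirty || e->locked || e->mapped)
		return false;
	e->present = false;
	return true;
}

/* Behaviour after the patch: this path leaves DAX entries in the tree. */
bool toy_dax_invalidate_new(struct toy_entry *e)
{
	(void)e;
	return false;
}

/* The state the changelog calls inconsistent: mapped, but no tree entry. */
bool toy_is_inconsistent(const struct toy_entry *e)
{
	return e->mapped && !e->present;
}
```

With a clean, unlocked, mapped entry, toy_dax_invalidate_old() removes the
slot and leaves the model in the "mapped but no entry" state, whereas
toy_page_invalidate() and toy_dax_invalidate_new() both leave it alone.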