Re: [PATCH 4/8] mm, dax: truncate dax mappings at bdev or fs shutdown

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



> Subject: mm, dax: unmap dax mappings at bdev or fs shutdown
> From: Dan Williams <dan.j.williams@xxxxxxxxx>
> 
> Currently dax mappings leak past / survive block_device shutdown.  While
> page cache pages are permitted to be read/written after the block_device
> is torn down this is not acceptable in the dax case as all media access
> must end when the device is disabled.  The pfn backing a dax mapping is
> permitted to be invalidated after bdev shutdown and this is indeed the
> case with brd.
> 
> When a dax capable block_device driver calls del_gendisk() in its
> shutdown path del_gendisk() needs to ensure that all DAX pfns are
> unmapped.  This is different than the pagecache backed case where the
> disk is protected by the queue being torn down which ends I/O to the
> device.  Since dax bypasses the page cache we need to unconditionally
> unmap the inode.
...
> +static void unmap_list(struct list_head *head)
> +{
> +	struct inode *inode, *_i;
> +
> +	list_for_each_entry_safe(inode, _i, head, i_lru) {
> +		list_del_init(&inode->i_lru);
> +		unmap_mapping_range(&inode->i_data, 0, 0, 1);
> +		iput(inode);
> +		cond_resched();
> +	}
> +}
> +
>  /**
>   * evict_inodes	- evict all evictable inodes for a superblock
>   * @sb:		superblock to operate on
> @@ -642,6 +654,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
>  	int busy = 0;
>  	struct inode *inode, *next;
>  	LIST_HEAD(dispose);
> +	LIST_HEAD(unmap);
>  
>  	spin_lock(&sb->s_inode_list_lock);
>  	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
> @@ -655,6 +668,19 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
>  			busy = 1;
>  			continue;
>  		}
> +		if (IS_DAX(inode) && atomic_read(&inode->i_count)) {
> +			/*
> +			 * dax mappings can't live past this invalidation event
> +			 * as there is no page cache present to allow the data
> +			 * to remain accessible.
> +			 */
> +			__iget(inode);
> +			inode_lru_list_del(inode);
> +			spin_unlock(&inode->i_lock);
> +			list_add(&inode->i_lru, &unmap);
> +			busy = 1;
> +			continue;
> +		}

I'm somewhat concerned about the abuse of i_lru list here. The inodes have
i_count > 0 and so the code generally assumes such inodes should be removed
from LRU list if they are there. Now I didn't find a place where this could
really hit you but it is fragile. And when that happens, you have a
corruption of your unmap list (since you access it without any locks) and
also you will not unmap your inode.

Also are you sure that your unmapping cannot race with other process
mapping the pfns again?

BTW what happens when you access a DAX pfn that got removed?

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux