Re: [PATCH] CHROMIUM: ecryptfs: sync before truncating lower inode

Tyler Hicks <tyhicks@xxxxxxxxxxxxx> · Thu, 20 Apr 2017 18:27:52 -0500

On 04/18/2017 06:36 PM, Andrey Pronin wrote:
> If the updated ecryptfs header data is not written to disk before
> the lower file is truncated, a crash may leave the filesystem
> in the state when the lower file truncation is journaled, while
> the changes to the ecryptfs header are lost (if the underlying
> filesystem is ext4 in data=ordered mode, for example). As a result,
> upon remounting and repairing the file may have a pre-truncation
> length and garbage data after the post-truncation end.
> 
> To reproduce, make a snapshot of the underlying ext4 filesystem
> mounted with data=ordered while asynchronously truncating to zero a
> group of files in ecryptfs mounted on top. Mount ecryptfs for the
> snapshot and check the contents of the group of files that was
> being truncated. The following script reproduces it in almost 100%
> of runs:
> 
> cd /tmp
> mkdir -p ./loop
> dd if=/dev/zero of=./file.img bs=1M count=10
> PW=secret
> 
> LOOPDEV=`losetup --find --show ./file.img`
> mkfs -t ext4 $LOOPDEV
> mount -t ext4 -o rw,nodev,relatime,seclabel,commit=600,data=ordered\
>  $LOOPDEV ./loop
> mkdir -p ./loop/vault ./loop/mount
> mount -t ecryptfs -o rw,relatime,seclabel,ecryptfs_cipher=aes,\
> ecryptfs_key_bytes=16,ecryptfs_unlink_sigs,ecryptfs_passthrough=no,\
> ecryptfs_enable_filename_crypto=no,passphrase_passwd="$PW",no_sig_cache\
>  ./loop/vault ./loop/mount
> for i in `seq 1 100`; do echo $i > ./loop/mount/test.$i; done
> sync
> for i in `seq 100 -1 1`; do truncate -s 0 ./loop/mount/test.$i; done &
> sleep 0.1; sync; cp ./file.img ./file.snap; sleep 1
> umount ./loop/mount
> umount ./loop
> losetup -d $LOOPDEV
> 
> LOOPDEV=`losetup --find --show ./file.snap`
> mount -t ext4 -o rw,nodev,relatime,seclabel,commit=600,data=ordered\
>  $LOOPDEV ./loop
> mount -t ecryptfs -o rw,relatime,seclabel,ecryptfs_cipher=aes,\
> ecryptfs_key_bytes=16,ecryptfs_unlink_sigs,ecryptfs_passthrough=no,\
> ecryptfs_enable_filename_crypto=no,passphrase_passwd="$PW",no_sig_cache\
>  ./loop/vault ./loop/mount
> for i in `seq 1 100`; do
>   if [ `stat -c %s ./loop/mount/test.$i` != 0 ] &&
>      [ `cat ./loop/mount/test.$i` != $i ]; then
>        echo -n "!!! garbage at $i: ";  cat ./loop/mount/test.$i; echo
>   fi
> done
> umount ./loop/mount
> umount ./loop
> losetup -d $LOOPDEV
> 
> Signed-off-by: Andrey Pronin <apronin@xxxxxxxxxxxx>
> ---

Hi Andrey - Thanks for the patch and for the test case. I was able to
reproduce the bug using the test case. I have some comments below.

>  fs/ecryptfs/ecryptfs_kernel.h |  1 +
>  fs/ecryptfs/inode.c           |  6 ++++++
>  fs/ecryptfs/read_write.c      | 22 ++++++++++++++++++++++
>  3 files changed, 29 insertions(+)
> 
> diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
> index f622a733f7ad..567698421270 100644
> --- a/fs/ecryptfs/ecryptfs_kernel.h
> +++ b/fs/ecryptfs/ecryptfs_kernel.h
> @@ -689,6 +689,7 @@ int ecryptfs_read_lower_page_segment(struct page *page_for_ecryptfs,
>  				     pgoff_t page_index,
>  				     size_t offset_in_page, size_t size,
>  				     struct inode *ecryptfs_inode);
> +int ecryptfs_fsync_lower(struct inode *ecryptfs_inode, int datasync);
>  struct page *ecryptfs_get_locked_page(struct inode *inode, loff_t index);
>  int ecryptfs_parse_packet_length(unsigned char *data, size_t *size,
>  				 size_t *length_size);
> diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
> index 5eab400e2590..e7eb8ea154d2 100644
> --- a/fs/ecryptfs/inode.c
> +++ b/fs/ecryptfs/inode.c
> @@ -827,6 +827,12 @@ static int truncate_upper(struct dentry *dentry, struct iattr *ia,
>  			       "rc = [%d]\n", rc);
>  			goto out;
>  		}
> +		rc = ecryptfs_fsync_lower(inode, 0);

Wouldn't we want datasync to be true in this situation?

I am also wondering if it'd be best to sync from inside
ecryptfs_write_inode_size_to_metadata() itself. Your test case shows
that the code path when truncating an inode size to zero is affected
but, from what I can tell, the code path when increasing an inode size
should also be affected:

 truncate_upper -> ecryptfs_write() ->
  ecryptfs_write_inode_size_to_metdata()

Did you consider/test doing that?

Thanks again!

Tyler

> +		if (rc) {
> +			printk(KERN_WARNING "Problem with ecryptfs_fsync_lower,"
> +			       "continue without syncing; "
> +			       "rc = [%d]\n", rc);
> +		}
>  		/* We are reducing the size of the ecryptfs file, and need to
>  		 * know if we need to reduce the size of the lower file. */
>  		lower_size_before_truncate =
> diff --git a/fs/ecryptfs/read_write.c b/fs/ecryptfs/read_write.c
> index 09fe622274e4..ba2dd6263875 100644
> --- a/fs/ecryptfs/read_write.c
> +++ b/fs/ecryptfs/read_write.c
> @@ -271,3 +271,25 @@ int ecryptfs_read_lower_page_segment(struct page *page_for_ecryptfs,
>  	flush_dcache_page(page_for_ecryptfs);
>  	return rc;
>  }
> +
> +/**
> + * ecryptfs_fsync_lower
> + * @ecryptfs_inode: The eCryptfs inode
> + * @datasync: Only perform a fdatasync operation
> + *
> + * Write back data and metadata for the lower file to disk.  If @datasync is
> + * set only metadata needed to access modified file data is written.
> + *
> + * Returns 0 on success; less than zero on error
> + */
> +int ecryptfs_fsync_lower(struct inode *ecryptfs_inode, int datasync)
> +{
> +	struct file *lower_file;
> +
> +	lower_file = ecryptfs_inode_to_private(ecryptfs_inode)->lower_file;
> +	if (!lower_file)
> +		return -EIO;
> +	if (!lower_file->f_op->fsync)
> +		return 0;
> +	return vfs_fsync(lower_file, datasync);
> +}
> 

Attachment:
signature.asc

Description: OpenPGP digital signature