Re: [PATCH 0/2] Avert possible deadlock with splice() and fanotify

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 30-11-23 12:07:23, Amir Goldstein wrote:
> On Thu, Nov 30, 2023 at 10:32 AM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> >
> > On Wed, Nov 29, 2023 at 10:07 PM Amir Goldstein <amir73il@xxxxxxxxx> wrote:
> > >
> > > Christian,
> > >
> > > Josef has helped me see the light and figure out how to avoid the
> > > possible deadlock, which involves:
> > > - splice() from source file in a loop mounted fs to dest file in
> > >   a host fs, where the loop image file is
> > > - fsfreeze on host fs
> > > - write to host fs in context of fanotify permission event handler
> > >   (FAN_ACCESS_PERM) on the splice source file
> > >
> > > The first patch should not be changing any logic.
> > > I only build tested the ceph patch, so hoping to get an
> > > Acked-by/Tested-by from Jeff.
> > >
> > > The second patch rids us of the deadlock by not holding
> > > file_start_write() while reading from splice source file.
> > >
> >
> > OOPS, I missed another corner case:
> > The COPY_FILE_SPLICE fallback of server-side-copy in nfsd/ksmbd
> > needs to use the start-write-safe variant of do_splice_direct(),
> > because in this case src and dst can be on any two fs.
> > Expect an extra patch in v2.
> >
> 
> For the interested, see server-side-copy patch below.
> Pushed to branch start-write-safe [1], but will wait with v2 until
> I get comments on v1.
> 
> Thanks,
> Amir.
> 
> [1] https://github.com/amir73il/linux/commits/start-write-safe
> 
> Author: Amir Goldstein <amir73il@xxxxxxxxx>
> Date:   Thu Nov 30 11:42:50 2023 +0200
> 
>     fs: use do_splice_direct() for nfsd/ksmbd server-side-copy
> 
>     nfsd/ksmbd call vfs_copy_file_range() with flag COPY_FILE_SPLICE to
>     perform kernel copy between two files on any two filesystems.
> 
>     Splicing input file, while holding file_start_write() on the output file
>     which is on a different sb, posses a risk for fanotify related deadlocks.
> 
>     We only need to call splice_file_range() from within the context of
>     ->copy_file_range() filesystem methods with file_start_write() held.
> 
>     To avoid the possible deadlocks, always use do_splice_direct() instead of
>     splice_file_range() for the kernel copy fallback in vfs_copy_file_range()
>     without holding file_start_write().
> 
>     Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> 
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 0bc99f38e623..12583e32aa6d 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -1565,11 +1565,18 @@ ssize_t vfs_copy_file_range(struct file
> *file_in, loff_t pos_in,
>          * and which filesystems do not, that will allow userspace tools to
>          * make consistent desicions w.r.t using copy_file_range().
>          *
> -        * We also get here if caller (e.g. nfsd) requested COPY_FILE_SPLICE.
> +        * We also get here if caller (e.g. nfsd) requested COPY_FILE_SPLICE
> +        * for server-side-copy between any two sb.
> +        *
> +        * In any case, we call do_splice_direct() and not splice_file_range(),
> +        * without file_start_write() held, to avoid possible deadlocks related
> +        * to splicing from input file, while file_start_write() is held on
> +        * the output file on a different sb.
>          */
> -       ret = generic_copy_file_range(file_in, pos_in, file_out, pos_out, len,
> -                                     flags);
> +       file_end_write(file_out);
> 
> +       ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out,
> +                              min_t(size_t, len, MAX_RW_COUNT), 0);
>  done:
>         if (ret > 0) {
>                 fsnotify_access(file_in);
> @@ -1581,8 +1588,6 @@ ssize_t vfs_copy_file_range(struct file
> *file_in, loff_t pos_in,
>         inc_syscr(current);
>         inc_syscw(current);
> 
> -       file_end_write(file_out);
> -

This file_end_write() is also used by the paths using ->copy_file_range()
and ->remap_file_range() so you need to balance those...

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux