Re: [PATCH 12/15] fs: move kiocb_start_write() into vfs_iocb_iter_write()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 17, 2023 at 9:46 PM Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
>
> On Tue, Nov 14, 2023 at 05:32:51PM +0200, Amir Goldstein wrote:
> > In vfs code, sb_start_write() is usually called after the permission hook
> > in rw_verify_area().  vfs_iocb_iter_write() is an exception to this rule,
> > where kiocb_start_write() is called by its callers.
> >
> > Move kiocb_start_write() from the callers into vfs_iocb_iter_write()
> > after the rw_verify_area() checks, to make them "start-write-safe".
> >
> > This is needed for fanotify "pre content" events.
> >
> > Suggested-by: Jan Kara <jack@xxxxxxx>
> > Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> > ---
> >  fs/cachefiles/io.c  | 2 --
> >  fs/overlayfs/file.c | 1 -
> >  fs/read_write.c     | 2 ++
> >  3 files changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/cachefiles/io.c b/fs/cachefiles/io.c
> > index 009d23cd435b..3d3667807636 100644
> > --- a/fs/cachefiles/io.c
> > +++ b/fs/cachefiles/io.c
> > @@ -319,8 +319,6 @@ int __cachefiles_write(struct cachefiles_object *object,
> >               ki->iocb.ki_complete = cachefiles_write_complete;
> >       atomic_long_add(ki->b_writing, &cache->b_writing);
> >
> > -     kiocb_start_write(&ki->iocb);
> > -
>
> This bit is subtly wrong, there's a little bit below that does
>
>         ret = cachefiles_inject_write_error();
>         if (ret == 0)
>                 ret = vfs_iocb_iter_write(file, &ki->iocb, iter);
>
> If cachefiles_inject_write_error() returns non-zero it'll fallthrough below and
> call cachefiles_write_complete() and complete the kiocb that hasn't started yet.

Ouch! good catch!

My initial proposal for kiocb_{start,end}_write() has an internal
IOCB_WRITE_STARTED flag to protect against this sort of error,
but Jens convinced me to remove it.

In io_uring and aio.c, IOCB_WRITE is set only after kiocb_start_write(),
so kiocb_end_write() is effectively gated by the IOCB_WRITE flag.
I could make this change in cachefiles/io.c to use IOCB_WRITE flag
as an indication for kiocb_start_write().

It's a bit ugly, but I can't think of anything better and maybe for the purpose
of error injection, the hack is not so bad?

Thanks,
Amir.

--- a/fs/cachefiles/io.c
+++ b/fs/cachefiles/io.c
@@ -259,7 +259,8 @@ static void cachefiles_write_complete(struct kiocb
*iocb, long ret)

        _enter("%ld", ret);

-       kiocb_end_write(iocb);
+       if (iocb->ki_flags & IOCB_WRITE)
+               kiocb_end_write(iocb);

        if (ret < 0)
                trace_cachefiles_io_error(object, inode, ret,
@@ -305,7 +306,7 @@ int __cachefiles_write(struct cachefiles_object *object,
        refcount_set(&ki->ki_refcnt, 2);
        ki->iocb.ki_filp        = file;
        ki->iocb.ki_pos         = start_pos;
-       ki->iocb.ki_flags       = IOCB_DIRECT | IOCB_WRITE;
+       ki->iocb.ki_flags       = IOCB_DIRECT;
        ki->iocb.ki_ioprio      = get_current_ioprio();
        ki->object              = object;
        ki->start               = start_pos;
@@ -319,16 +320,17 @@ int __cachefiles_write(struct cachefiles_object *object,
                ki->iocb.ki_complete = cachefiles_write_complete;
        atomic_long_add(ki->b_writing, &cache->b_writing);

-       kiocb_start_write(&ki->iocb);
-
        get_file(ki->iocb.ki_filp);
        cachefiles_grab_object(object, cachefiles_obj_get_ioreq);

        trace_cachefiles_write(object, file_inode(file), ki->iocb.ki_pos, len);
        old_nofs = memalloc_nofs_save();
        ret = cachefiles_inject_write_error();
-       if (ret == 0)
+       if (ret == 0) {
+               kiocb_start_write(&ki->iocb);
+               ki->iocb.ki_flags |= IOCB_WRITE;
                ret = vfs_iocb_iter_write(file, &ki->iocb, iter);
+       }
        memalloc_nofs_restore(old_nofs);
        switch (ret) {
        case -EIOCBQUEUED:




[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux