Re: [PATCH 1/2] NFSD: Fix use-after-free warning when doing inter-server copy

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Feb 22, 2021 at 5:09 PM <dai.ngo@xxxxxxxxxx> wrote:
>
>
> On 2/22/21 2:01 PM, dai.ngo@xxxxxxxxxx wrote:
> >
> > On 2/22/21 1:46 PM, dai.ngo@xxxxxxxxxx wrote:
> >>
> >> On 2/22/21 10:34 AM, dai.ngo@xxxxxxxxxx wrote:
> >>>
> >>> On 2/20/21 8:16 PM, dai.ngo@xxxxxxxxxx wrote:
> >>>> On 2/20/21 6:08 AM, Olga Kornievskaia wrote:
> >>>>> On Fri, Feb 19, 2021 at 10:21 PM J. Bruce Fields
> >>>>> <bfields@xxxxxxxxxxxx> wrote:
> >>>>>> On Fri, Feb 19, 2021 at 05:31:58PM -0800, dai.ngo@xxxxxxxxxx wrote:
> >>>>>>> If this is the cause why we don't drop the mount after the copy
> >>>>>>> then I can restore the patch and look into this problem.
> >>>>>>> Unfortunately,
> >>>>>>> all my test machines are down for maintenance until Sunday/Monday.
> >>>>>> I think we can take some time to figure out what's actually going on
> >>>>>> here before reverting anything.
> >>>>> Yes I agree. We need to fix the use-after-free and also make sure
> >>>>> that
> >>>>> reference will go away.
> >>>
> >>> I reverted the patch, verified the warning message is back:
> >>>
> >>> Feb 22 10:07:45 nfsvmf24 kernel: ------------[ cut here ]------------
> >>> Feb 22 10:07:45 nfsvmf24 kernel: refcount_t: underflow; use-after-free.
> >>>
> >>> then did a inter-server copy and waited for more than 20 mins and
> >>> the destination server still maintains the session with the source
> >>> server.  It must be some other references that prevent the mount
> >>> to go away.
> >>
> >> This change fixed the unmount after inter-server copy problem:
> >>
> >> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> >> index 8d6d2678abad..87687cd18938 100644
> >> --- a/fs/nfsd/nfs4proc.c
> >> +++ b/fs/nfsd/nfs4proc.c
> >> @@ -1304,7 +1304,7 @@ nfsd4_cleanup_inter_ssc(struct vfsmount
> >> *ss_mnt, struct nfsd_file *src,
> >>                         struct nfsd_file *dst)
> >>  {
> >>         nfs42_ssc_close(src->nf_file);
> >> -       /* 'src' is freed by nfsd4_do_async_copy */
> >> +       nfsd_file_put(src);
> >>         nfsd_file_put(dst);
> >>         mntput(ss_mnt);
> >>  }
>
> This change is not need. It's left over from my testing to
> reproduce the warning messages. Only the change in
> nfsd4_do_async_copy is needed for the unmount problem.
>
> -Dai
>
> >> @@ -1472,14 +1472,12 @@ static int nfsd4_do_async_copy(void *data)
> >>                 copy->nf_src = kzalloc(sizeof(struct nfsd_file),
> >> GFP_KERNEL);
> >>                 if (!copy->nf_src) {
> >>                         copy->nfserr = nfserr_serverfault;
> >> - nfsd4_interssc_disconnect(copy->ss_mnt);
> >>                         goto do_callback;
> >>                 }
> >>                 copy->nf_src->nf_file = nfs42_ssc_open(copy->ss_mnt,
> >> &copy->c_fh,
> >> &copy->stateid);
> >>                 if (IS_ERR(copy->nf_src->nf_file)) {
> >>                         copy->nfserr = nfserr_offload_denied;
> >> - nfsd4_interssc_disconnect(copy->ss_mnt);
> >>                         goto do_callback;
> >>                 }
> >>         }
> >> @@ -1498,6 +1496,7 @@ static int nfsd4_do_async_copy(void *data)
> >>                         &nfsd4_cb_offload_ops,
> >> NFSPROC4_CLNT_CB_OFFLOAD);
> >>         nfsd4_run_cb(&cb_copy->cp_cb);
> >>  out:
> >> +       nfsd4_interssc_disconnect(copy->ss_mnt);
> >>         if (!copy->cp_intra)
> >>                 kfree(copy->nf_src);
> >>         cleanup_async_copy(copy);
> >>
> >> But there is something new. I tried inter-server copy twice.
> >> First time I can verify from tshark capture that a session was
> >> created and destroy, along with all the NFS ops. On 2nd try,
> >> I can

Hi Dai/Bruce,

While I believe the fix works (as in the mount goes away), I'm not
comfortable with this fix as I believe we will be leaking resources.
Server calls nfs42_ssc_open() which creates a legit file pointer (yes
it takes a reference on superblock but it also allocates memory for
"file" structure). Normally a file structure requires doing an fput()
which if I look thru the code does a bunch of things before in the end
calling mntput(). While we free the copy->nf_src in
nfs4_do_asyn_copy(), the copy->nf_src->nf_file was allocated
separately and would have been freed from calling fput() on it.

So I guess it's not correct to do kfree(copy->nf_src) in teh
nfs4_do_async_copy() I think that's probably where use-after-free
comes in. We need to keep it until the cleanup. I'm thinking perhaps
call fput(copy->nf_src->nf_file) first and then free it? Just an idea
but I haven't tested it.



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux