Re: [PATCH v2 1/1] NFSD: enhance inter-server copy cleanup

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 5, 2023 at 4:11 PM Olga Kornievskaia <aglo@xxxxxxxxx> wrote:
>
> On Thu, Jan 5, 2023 at 3:00 PM <dai.ngo@xxxxxxxxxx> wrote:
> >
> > Hi Olga,
> >
> > On 1/5/23 8:10 AM, Olga Kornievskaia wrote:
> > > On Tue, Jan 3, 2023 at 11:14 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
> > >>
> > >>
> > >>> On Dec 18, 2022, at 7:55 PM, Dai Ngo <dai.ngo@xxxxxxxxxx> wrote:
> > >>>
> > >>> Currently nfsd4_setup_inter_ssc returns the vfsmount of the source
> > >>> server's export when the mount completes. After the copy is done
> > >>> nfsd4_cleanup_inter_ssc is called with the vfsmount of the source
> > >>> server and it searches nfsd_ssc_mount_list for a matching entry
> > >>> to do the clean up.
> > >>>
> > >>> The problems with this approach are (1) the need to search the
> > >>> nfsd_ssc_mount_list and (2) the code has to handle the case where
> > >>> the matching entry is not found which looks ugly.
> > >>>
> > >>> The enhancement is instead of nfsd4_setup_inter_ssc returning the
> > >>> vfsmount, it returns the nfsd4_ssc_umount_item which has the
> > >>> vfsmount embedded in it. When nfsd4_cleanup_inter_ssc is called
> > >>> it's passed with the nfsd4_ssc_umount_item directly to do the
> > >>> clean up so no searching is needed and there is no need to handle
> > >>> the 'not found' case.
> > >>>
> > >>> Signed-off-by: Dai Ngo <dai.ngo@xxxxxxxxxx>
> > >>> ---
> > >>> V2: fix compile error when CONFIG_NFSD_V4_2_INTER_SSC not defined.
> > >>>     Reported by kernel test robot.
> > >> Hello Dai - I've looked at this, nothing to comment on so far. I
> > >> plan to go over it again sometime this week.
> > >>
> > >> I'd like to hear from others before applying it.
> > > I have looked at it and logically it seems ok to me.
> >
> > Thank you for reviewing the patch.
> >
> > >   I have tested it
> > > (sorta. i'm rarely able to finish). But I keep running into the other
> > > problem (nfsd4_state_shrinker_count soft lockup that's been already
> > > reported). I find it interesting that only my destination server hits
> > > the problem (but not the source server). I don't believe this patch
> > > has anything to do with this problem, but I found it interesting that
> > > ssc testing seems to trigger it 100%.
> >
> > It's strange that you and Mike keep having this problem, I've been trying
> > to reproduce the problem using Mike's procedure with no success.
> >
> >  From Mike's report it seems that the struct delayed_work, part of the
> > nfsd_net, was freed when nfsd4_state_shrinker_count was called. I'm trying
> > to see why this could happen. Currently we call unregister_shrinker from
> > nfsd_exit_net. Perhaps there is another path that the nfsd_net can be
> > freed?
> >
> > Can you share your test procedure so I can try?
>
> I have nothing special. I have 3 RHEL8 VMs running upstream kernels. 2
> servers and 1 client. I'm just running nfstest_ssc --runtest inter01.
> Given that the trace says it's kswapd that has this trace, doesn't it
> mean my VM is stressed for memory perhaps. So perhaps see if you can
> reduce your VM memsize? My VM has 2G of memory.
>
> I have reverted a1049eb47f20 commit but that didn't help.

Ops. I reverted the wrong commit(s). Reverted 44df6f439a17,
3959066b697b, and the tracepoint one for cb_recall_any. I can run
clean thru the ssc tests with this new patch.

>
>
> >
> > Thanks,
> > -Dai
> >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >>> fs/nfsd/nfs4proc.c      | 94 +++++++++++++++++++------------------------------
> > >>> fs/nfsd/xdr4.h          |  2 +-
> > >>> include/linux/nfs_ssc.h |  2 +-
> > >>> 3 files changed, 38 insertions(+), 60 deletions(-)
> > >>>
> > >>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > >>> index b79ee65ae016..6515b00520bc 100644
> > >>> --- a/fs/nfsd/nfs4proc.c
> > >>> +++ b/fs/nfsd/nfs4proc.c
> > >>> @@ -1295,15 +1295,15 @@ extern void nfs_sb_deactive(struct super_block *sb);
> > >>>   * setup a work entry in the ssc delayed unmount list.
> > >>>   */
> > >>> static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
> > >>> -             struct nfsd4_ssc_umount_item **retwork, struct vfsmount **ss_mnt)
> > >>> +             struct nfsd4_ssc_umount_item **nsui)
> > >>> {
> > >>>        struct nfsd4_ssc_umount_item *ni = NULL;
> > >>>        struct nfsd4_ssc_umount_item *work = NULL;
> > >>>        struct nfsd4_ssc_umount_item *tmp;
> > >>>        DEFINE_WAIT(wait);
> > >>> +     __be32 status = 0;
> > >>>
> > >>> -     *ss_mnt = NULL;
> > >>> -     *retwork = NULL;
> > >>> +     *nsui = NULL;
> > >>>        work = kzalloc(sizeof(*work), GFP_KERNEL);
> > >>> try_again:
> > >>>        spin_lock(&nn->nfsd_ssc_lock);
> > >>> @@ -1326,12 +1326,12 @@ static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
> > >>>                        finish_wait(&nn->nfsd_ssc_waitq, &wait);
> > >>>                        goto try_again;
> > >>>                }
> > >>> -             *ss_mnt = ni->nsui_vfsmount;
> > >>> +             *nsui = ni;
> > >>>                refcount_inc(&ni->nsui_refcnt);
> > >>>                spin_unlock(&nn->nfsd_ssc_lock);
> > >>>                kfree(work);
> > >>>
> > >>> -             /* return vfsmount in ss_mnt */
> > >>> +             /* return vfsmount in (*nsui)->nsui_vfsmount */
> > >>>                return 0;
> > >>>        }
> > >>>        if (work) {
> > >>> @@ -1339,10 +1339,11 @@ static __be32 nfsd4_ssc_setup_dul(struct nfsd_net *nn, char *ipaddr,
> > >>>                refcount_set(&work->nsui_refcnt, 2);
> > >>>                work->nsui_busy = true;
> > >>>                list_add_tail(&work->nsui_list, &nn->nfsd_ssc_mount_list);
> > >>> -             *retwork = work;
> > >>> -     }
> > >>> +             *nsui = work;
> > >>> +     } else
> > >>> +             status = nfserr_resource;
> > >>>        spin_unlock(&nn->nfsd_ssc_lock);
> > >>> -     return 0;
> > >>> +     return status;
> > >>> }
> > >>>
> > >>> static void nfsd4_ssc_update_dul_work(struct nfsd_net *nn,
> > >>> @@ -1371,7 +1372,7 @@ static void nfsd4_ssc_cancel_dul_work(struct nfsd_net *nn,
> > >>>   */
> > >>> static __be32
> > >>> nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
> > >>> -                    struct vfsmount **mount)
> > >>> +                     struct nfsd4_ssc_umount_item **nsui)
> > >>> {
> > >>>        struct file_system_type *type;
> > >>>        struct vfsmount *ss_mnt;
> > >>> @@ -1382,7 +1383,6 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
> > >>>        char *ipaddr, *dev_name, *raw_data;
> > >>>        int len, raw_len;
> > >>>        __be32 status = nfserr_inval;
> > >>> -     struct nfsd4_ssc_umount_item *work = NULL;
> > >>>        struct nfsd_net *nn = net_generic(SVC_NET(rqstp), nfsd_net_id);
> > >>>
> > >>>        naddr = &nss->u.nl4_addr;
> > >>> @@ -1390,6 +1390,7 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
> > >>>                                         naddr->addr_len,
> > >>>                                         (struct sockaddr *)&tmp_addr,
> > >>>                                         sizeof(tmp_addr));
> > >>> +     *nsui = NULL;
> > >>>        if (tmp_addrlen == 0)
> > >>>                goto out_err;
> > >>>
> > >>> @@ -1432,10 +1433,10 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
> > >>>                goto out_free_rawdata;
> > >>>        snprintf(dev_name, len + 5, "%s%s%s:/", startsep, ipaddr, endsep);
> > >>>
> > >>> -     status = nfsd4_ssc_setup_dul(nn, ipaddr, &work, &ss_mnt);
> > >>> +     status = nfsd4_ssc_setup_dul(nn, ipaddr, nsui);
> > >>>        if (status)
> > >>>                goto out_free_devname;
> > >>> -     if (ss_mnt)
> > >>> +     if ((*nsui)->nsui_vfsmount)
> > >>>                goto out_done;
> > >>>
> > >>>        /* Use an 'internal' mount: SB_KERNMOUNT -> MNT_INTERNAL */
> > >>> @@ -1443,15 +1444,12 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
> > >>>        module_put(type->owner);
> > >>>        if (IS_ERR(ss_mnt)) {
> > >>>                status = nfserr_nodev;
> > >>> -             if (work)
> > >>> -                     nfsd4_ssc_cancel_dul_work(nn, work);
> > >>> +             nfsd4_ssc_cancel_dul_work(nn, *nsui);
> > >>>                goto out_free_devname;
> > >>>        }
> > >>> -     if (work)
> > >>> -             nfsd4_ssc_update_dul_work(nn, work, ss_mnt);
> > >>> +     nfsd4_ssc_update_dul_work(nn, *nsui, ss_mnt);
> > >>> out_done:
> > >>>        status = 0;
> > >>> -     *mount = ss_mnt;
> > >>>
> > >>> out_free_devname:
> > >>>        kfree(dev_name);
> > >>> @@ -1474,8 +1472,7 @@ nfsd4_interssc_connect(struct nl4_server *nss, struct svc_rqst *rqstp,
> > >>>   */
> > >>> static __be32
> > >>> nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
> > >>> -                   struct nfsd4_compound_state *cstate,
> > >>> -                   struct nfsd4_copy *copy, struct vfsmount **mount)
> > >>> +             struct nfsd4_compound_state *cstate, struct nfsd4_copy *copy)
> > >>> {
> > >>>        struct svc_fh *s_fh = NULL;
> > >>>        stateid_t *s_stid = &copy->cp_src_stateid;
> > >>> @@ -1488,7 +1485,8 @@ nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
> > >>>        if (status)
> > >>>                goto out;
> > >>>
> > >>> -     status = nfsd4_interssc_connect(copy->cp_src, rqstp, mount);
> > >>> +     status = nfsd4_interssc_connect(copy->cp_src, rqstp,
> > >>> +                             &copy->ss_nsui);
> > >>>        if (status)
> > >>>                goto out;
> > >>>
> > >>> @@ -1506,61 +1504,42 @@ nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
> > >>> }
> > >>>
> > >>> static void
> > >>> -nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt, struct file *filp,
> > >>> +nfsd4_cleanup_inter_ssc(struct nfsd4_ssc_umount_item *ni, struct file *filp,
> > >>>                        struct nfsd_file *dst)
> > >>> {
> > >>> -     bool found = false;
> > >>>        long timeout;
> > >>> -     struct nfsd4_ssc_umount_item *tmp;
> > >>> -     struct nfsd4_ssc_umount_item *ni = NULL;
> > >>>        struct nfsd_net *nn = net_generic(dst->nf_net, nfsd_net_id);
> > >>>
> > >>>        nfs42_ssc_close(filp);
> > >>>        nfsd_file_put(dst);
> > >>>        fput(filp);
> > >>>
> > >>> -     if (!nn) {
> > >>> -             mntput(ss_mnt);
> > >>> -             return;
> > >>> -     }
> > >>>        spin_lock(&nn->nfsd_ssc_lock);
> > >>>        timeout = msecs_to_jiffies(nfsd4_ssc_umount_timeout);
> > >>> -     list_for_each_entry_safe(ni, tmp, &nn->nfsd_ssc_mount_list, nsui_list) {
> > >>> -             if (ni->nsui_vfsmount->mnt_sb == ss_mnt->mnt_sb) {
> > >>> -                     list_del(&ni->nsui_list);
> > >>> -                     /*
> > >>> -                      * vfsmount can be shared by multiple exports,
> > >>> -                      * decrement refcnt. If the count drops to 1 it
> > >>> -                      * will be unmounted when nsui_expire expires.
> > >>> -                      */
> > >>> -                     refcount_dec(&ni->nsui_refcnt);
> > >>> -                     ni->nsui_expire = jiffies + timeout;
> > >>> -                     list_add_tail(&ni->nsui_list, &nn->nfsd_ssc_mount_list);
> > >>> -                     found = true;
> > >>> -                     break;
> > >>> -             }
> > >>> -     }
> > >>> +     list_del(&ni->nsui_list);
> > >>> +     /*
> > >>> +      * vfsmount can be shared by multiple exports,
> > >>> +      * decrement refcnt. If the count drops to 1 it
> > >>> +      * will be unmounted when nsui_expire expires.
> > >>> +      */
> > >>> +     refcount_dec(&ni->nsui_refcnt);
> > >>> +     ni->nsui_expire = jiffies + timeout;
> > >>> +     list_add_tail(&ni->nsui_list, &nn->nfsd_ssc_mount_list);
> > >>>        spin_unlock(&nn->nfsd_ssc_lock);
> > >>> -     if (!found) {
> > >>> -             mntput(ss_mnt);
> > >>> -             return;
> > >>> -     }
> > >>> }
> > >>>
> > >>> #else /* CONFIG_NFSD_V4_2_INTER_SSC */
> > >>>
> > >>> static __be32
> > >>> nfsd4_setup_inter_ssc(struct svc_rqst *rqstp,
> > >>> -                   struct nfsd4_compound_state *cstate,
> > >>> -                   struct nfsd4_copy *copy,
> > >>> -                   struct vfsmount **mount)
> > >>> +                     struct nfsd4_compound_state *cstate,
> > >>> +                     struct nfsd4_copy *copy)
> > >>> {
> > >>> -     *mount = NULL;
> > >>>        return nfserr_inval;
> > >>> }
> > >>>
> > >>> static void
> > >>> -nfsd4_cleanup_inter_ssc(struct vfsmount *ss_mnt, struct file *filp,
> > >>> +nfsd4_cleanup_inter_ssc(struct nfsd4_ssc_umount_item *ni, struct file *filp,
> > >>>                        struct nfsd_file *dst)
> > >>> {
> > >>> }
> > >>> @@ -1700,7 +1679,7 @@ static void dup_copy_fields(struct nfsd4_copy *src, struct nfsd4_copy *dst)
> > >>>        memcpy(dst->cp_src, src->cp_src, sizeof(struct nl4_server));
> > >>>        memcpy(&dst->stateid, &src->stateid, sizeof(src->stateid));
> > >>>        memcpy(&dst->c_fh, &src->c_fh, sizeof(src->c_fh));
> > >>> -     dst->ss_mnt = src->ss_mnt;
> > >>> +     dst->ss_nsui = src->ss_nsui;
> > >>> }
> > >>>
> > >>> static void cleanup_async_copy(struct nfsd4_copy *copy)
> > >>> @@ -1749,8 +1728,8 @@ static int nfsd4_do_async_copy(void *data)
> > >>>        if (nfsd4_ssc_is_inter(copy)) {
> > >>>                struct file *filp;
> > >>>
> > >>> -             filp = nfs42_ssc_open(copy->ss_mnt, &copy->c_fh,
> > >>> -                                   &copy->stateid);
> > >>> +             filp = nfs42_ssc_open(copy->ss_nsui->nsui_vfsmount,
> > >>> +                             &copy->c_fh, &copy->stateid);
> > >>>                if (IS_ERR(filp)) {
> > >>>                        switch (PTR_ERR(filp)) {
> > >>>                        case -EBADF:
> > >>> @@ -1764,7 +1743,7 @@ static int nfsd4_do_async_copy(void *data)
> > >>>                }
> > >>>                nfserr = nfsd4_do_copy(copy, filp, copy->nf_dst->nf_file,
> > >>>                                       false);
> > >>> -             nfsd4_cleanup_inter_ssc(copy->ss_mnt, filp, copy->nf_dst);
> > >>> +             nfsd4_cleanup_inter_ssc(copy->ss_nsui, filp, copy->nf_dst);
> > >>>        } else {
> > >>>                nfserr = nfsd4_do_copy(copy, copy->nf_src->nf_file,
> > >>>                                       copy->nf_dst->nf_file, false);
> > >>> @@ -1790,8 +1769,7 @@ nfsd4_copy(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> > >>>                        status = nfserr_notsupp;
> > >>>                        goto out;
> > >>>                }
> > >>> -             status = nfsd4_setup_inter_ssc(rqstp, cstate, copy,
> > >>> -                             &copy->ss_mnt);
> > >>> +             status = nfsd4_setup_inter_ssc(rqstp, cstate, copy);
> > >>>                if (status)
> > >>>                        return nfserr_offload_denied;
> > >>>        } else {
> > >>> diff --git a/fs/nfsd/xdr4.h b/fs/nfsd/xdr4.h
> > >>> index 0eb00105d845..36c3340c1d54 100644
> > >>> --- a/fs/nfsd/xdr4.h
> > >>> +++ b/fs/nfsd/xdr4.h
> > >>> @@ -571,7 +571,7 @@ struct nfsd4_copy {
> > >>>        struct task_struct      *copy_task;
> > >>>        refcount_t              refcount;
> > >>>
> > >>> -     struct vfsmount         *ss_mnt;
> > >>> +     struct nfsd4_ssc_umount_item *ss_nsui;
> > >>>        struct nfs_fh           c_fh;
> > >>>        nfs4_stateid            stateid;
> > >>> };
> > >>> diff --git a/include/linux/nfs_ssc.h b/include/linux/nfs_ssc.h
> > >>> index 75843c00f326..22265b1ff080 100644
> > >>> --- a/include/linux/nfs_ssc.h
> > >>> +++ b/include/linux/nfs_ssc.h
> > >>> @@ -53,6 +53,7 @@ static inline void nfs42_ssc_close(struct file *filep)
> > >>>        if (nfs_ssc_client_tbl.ssc_nfs4_ops)
> > >>>                (*nfs_ssc_client_tbl.ssc_nfs4_ops->sco_close)(filep);
> > >>> }
> > >>> +#endif
> > >>>
> > >>> struct nfsd4_ssc_umount_item {
> > >>>        struct list_head nsui_list;
> > >>> @@ -66,7 +67,6 @@ struct nfsd4_ssc_umount_item {
> > >>>        struct vfsmount *nsui_vfsmount;
> > >>>        char nsui_ipaddr[RPC_MAX_ADDRBUFLEN + 1];
> > >>> };
> > >>> -#endif
> > >>>
> > >>> /*
> > >>>   * NFS_FS
> > >>> --
> > >>> 2.9.5
> > >>>
> > >> --
> > >> Chuck Lever
> > >>
> > >>
> > >>



[Index of Archives]     [Linux Filesystem Development]     [Linux USB Development]     [Linux Media Development]     [Video for Linux]     [Linux NILFS]     [Linux Audio Users]     [Yosemite Info]     [Linux SCSI]

  Powered by Linux