Re: [PATCH V2] mds: fix CDir::_commit_partial() bug

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sun, 25 Nov 2012, Yan, Zheng wrote:
> On 11/25/2012 12:21 AM, Sage Weil wrote:
> > On Sat, 24 Nov 2012, Yan, Zheng wrote:
> >> Hi, Sage
> >>
> >> I found this fix is still not enough because we don't know if a dentry has been
> >> successfully deleted from the directory fragment after MDS restart. There are several
> >> options to fix this, which one do you like?
> >>
> >> 1. Make OSD ignore the TMAP_RM commands if it can't the find items.
> > 
> > This is certainly the simplest.  Since TMAP is effectively deprecated and 
> > everyone should be using the omap stuff anyway, it gets my vote.
> > 
> > Speaking of, we should transition the MDS to use the omap operations soon 
> > as well, since TMAP is horribly inefficient for large directories.  Also, 
> > the rmkey method in that case does not have this problem.
> 
> I can do this, it should be easy.
> 
> > 
> > Let's ignore enoent on TMAP_RM for bobtail, and transition to omap in the 
> > next release... do you want to test the patch that drops the !key_exists 
> > check?
> 
> I will modify the code and test it. but I think it is hiding bugs ;)
> 
> Another issue of tmap is that we may get different results when comparing two dentry_key_t
> and their string representation. So the dentries in tmap can be out of order, setkey method
> may add duplicated dentry and rmkey method may fail to find existing dentry. Since tmap is
> so buggy, let's switch the MDS to use the omap ASAP.

Yeah.  I think the question is what is the minimal set of fixes we should 
do to the TMAP implementation for bobtail before changing to omap.  The 
dentry_key_t ordering fix you posted the other day and ignoring the rm 
failures should catch these, right?

sage


> 
> Regards
> Yan, Zheng
> 
> 
> > 
> > Thanks!
> > sage
> > 
> >  
> >> 2. Add a new item deletion command for tmap, the new command doesn't make whole tmap
> >>    update operation fail if it can't the item. When committing directory fragments,
> >>    use the new command for null dentries that were added by MDS log replay.
> >>
> >> 3. When committing a directory fragment, Re-fetch it if it contains dirty+null dentries
> >>    that were added by MDS log replay. Find already deleted items and mark corresponding
> >>    dentries as new.
> >>
> >> Regards
> >> Yan, Zheng
> >>
> >> On 11/23/2012 12:52 AM, Yan, Zheng wrote:
> >>> From: "Yan, Zheng" <zheng.z.yan@xxxxxxxxx>
> >>>
> >>> When a null dentry is encountered, CDir::_commit_partial() adds
> >>> a OSD_TMAP_RM command to delete the dentry. But if the dentry is
> >>> new, the osd will not find the dentry when handling the command
> >>> and the tmap update operation will fail totally.
> >>>
> >>> This patch also makes sure dentries are properly marked as new
> >>> when preparing new dentries and exporting dentries.
> >>>
> >>> Signed-off-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx>
> >>> ---
> >>>  src/mds/CDentry.h  |  2 ++
> >>>  src/mds/CDir.cc    | 11 ++++++++---
> >>>  src/mds/CDir.h     |  2 +-
> >>>  src/mds/MDCache.cc |  9 ++++++---
> >>>  src/mds/Server.cc  |  3 +++
> >>>  5 files changed, 20 insertions(+), 7 deletions(-)
> >>>
> >>> diff --git a/src/mds/CDentry.h b/src/mds/CDentry.h
> >>> index 480e562..5755c55 100644
> >>> --- a/src/mds/CDentry.h
> >>> +++ b/src/mds/CDentry.h
> >>> @@ -347,6 +347,8 @@ public:
> >>>      // twiddle
> >>>      state = 0;
> >>>      state_set(CDentry::STATE_AUTH);
> >>> +    if (nstate & STATE_NEW)
> >>> +      mark_new();
> >>>      if (nstate & STATE_DIRTY)
> >>>        _mark_dirty(ls);
> >>>      if (!replica_map.empty())
> >>> diff --git a/src/mds/CDir.cc b/src/mds/CDir.cc
> >>> index c5220ed..411d864 100644
> >>> --- a/src/mds/CDir.cc
> >>> +++ b/src/mds/CDir.cc
> >>> @@ -1696,7 +1696,7 @@ class C_Dir_Committed : public Context {
> >>>  public:
> >>>    C_Dir_Committed(CDir *d, version_t v, version_t lrv) : dir(d), version(v), last_renamed_version(lrv) { }
> >>>    void finish(int r) {
> >>> -    dir->_committed(version, last_renamed_version);
> >>> +    dir->_committed(version, last_renamed_version, r);
> >>>    }
> >>>  };
> >>>  
> >>> @@ -1802,6 +1802,10 @@ CDir::map_t::iterator CDir::_commit_partial(ObjectOperation& m,
> >>>        continue;  // skip clean dentries
> >>>  
> >>>      if (dn->get_linkage()->is_null()) {
> >>> +      if (dn->is_new()) {
> >>> +	dn->mark_clean();
> >>> +	continue;
> >>> +      }
> >>>        dout(10) << " rm " << dn->name << " " << *dn << dendl;
> >>>        finalbl.append(CEPH_OSD_TMAP_RM);
> >>>        dn->key().encode(finalbl);
> >>> @@ -1997,10 +2001,11 @@ void CDir::_commit(version_t want)
> >>>   *
> >>>   * @param v version i just committed
> >>>   */
> >>> -void CDir::_committed(version_t v, version_t lrv)
> >>> +void CDir::_committed(version_t v, version_t lrv, int ret)
> >>>  {
> >>> -  dout(10) << "_committed v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> >>> +  dout(10) << "_committed ret " << ret << " v " << v << " (last renamed " << lrv << ") on " << *this << dendl;
> >>>    assert(is_auth());
> >>> +  assert(ret == 0);
> >>>  
> >>>    bool stray = inode->is_stray();
> >>>  
> >>> diff --git a/src/mds/CDir.h b/src/mds/CDir.h
> >>> index 2222418..274e38b 100644
> >>> --- a/src/mds/CDir.h
> >>> +++ b/src/mds/CDir.h
> >>> @@ -487,7 +487,7 @@ private:
> >>>                         unsigned max_write_size=-1,
> >>>                         map_t::iterator last_committed_dn=map_t::iterator());
> >>>    void _encode_dentry(CDentry *dn, bufferlist& bl, const set<snapid_t> *snaps);
> >>> -  void _committed(version_t v, version_t last_renamed_version);
> >>> +  void _committed(version_t v, version_t last_renamed_version, int ret);
> >>>    void wait_for_commit(Context *c, version_t v=0);
> >>>  
> >>>    // -- dirtyness --
> >>> diff --git a/src/mds/MDCache.cc b/src/mds/MDCache.cc
> >>> index f8b1c8f..e69a49f 100644
> >>> --- a/src/mds/MDCache.cc
> >>> +++ b/src/mds/MDCache.cc
> >>> @@ -657,12 +657,15 @@ CDentry *MDCache::get_or_create_stray_dentry(CInode *in)
> >>>    CDir *straydir = strayi->get_dirfrag(fg);
> >>>    assert(straydir);
> >>>    CDentry *straydn = straydir->lookup(straydname);
> >>> -  if (!straydn) {
> >>> +
> >>> +  if (!straydn)
> >>>      straydn = straydir->add_null_dentry(straydname);
> >>> -    straydn->mark_new();
> >>> -  } else 
> >>> +  else
> >>>      assert(straydn->get_projected_linkage()->is_null());
> >>>  
> >>> +  if (!straydn->is_dirty())
> >>> +    straydn->mark_new();
> >>> +
> >>>    return straydn;
> >>>  }
> >>>  
> >>> diff --git a/src/mds/Server.cc b/src/mds/Server.cc
> >>> index ec0d5d5..228fede 100644
> >>> --- a/src/mds/Server.cc
> >>> +++ b/src/mds/Server.cc
> >>> @@ -1685,6 +1685,9 @@ CDentry* Server::prepare_null_dentry(MDRequest *mdr, CDir *dir, const string& dn
> >>>        }
> >>>      }
> >>>  
> >>> +    if (!dn->is_dirty())
> >>> +      dn->mark_new();
> >>> +
> >>>      return dn;
> >>>    }
> >>>  
> >>>
> >>
> >>
> 
> 
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [CEPH Users]     [Ceph Large]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]
  Powered by Linux