Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

On Sat, Apr 26, 2014 at 9:56 AM, Jingyuan Luke <jyluke@xxxxxxxxx> wrote:
> Hi Greg,
>
> Actually our cluster is pretty empty, but we suspect we had a temporary
> network disconnection to one of our OSDs. We are not sure if this caused
> the problem.
>
> Anyway, we don't mind trying the method you mentioned. How can we do that?
>

Compile ceph-mds with the attached patch and add the line "mds
wipe_sessions = 1" to ceph.conf.

Yan, Zheng

> Regards,
> Luke
>
>
> On Saturday, April 26, 2014, Gregory Farnum <greg@xxxxxxxxxxx> wrote:
>>
>> Hmm, it looks like your on-disk SessionMap is horrendously out of
>> date. Did your cluster get full at some point?
>>
>> In any case, we're working on tools to repair this now but they aren't
>> ready for use yet. Probably the only thing you could do is create an
>> empty sessionmap with a higher version than the ones the journal
>> refers to, but that might have other fallout effects...
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> On Fri, Apr 25, 2014 at 2:57 AM, Mohd Bazli Ab Karim
>> <bazli.abkarim@xxxxxxxx> wrote:
>> > More logs. I ran ceph-mds with debug-mds=20.
>> >
>> > -2> 2014-04-25 17:47:54.839672 7f0d6f3f0700 10 mds.0.journal
>> > EMetaBlob.replay inotable tablev 4316124 <= table 4317932
>> > -1> 2014-04-25 17:47:54.839674 7f0d6f3f0700 10 mds.0.journal
>> > EMetaBlob.replay sessionmap v8632368 -(1|2) == table 7239603 prealloc
>> > [1000041df86~1] used 1000041db9e
>> >   0> 2014-04-25 17:47:54.840733 7f0d6f3f0700 -1 mds/journal.cc: In
>> > function 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)' thread
>> > 7f0d6f3f0700 time 2014-04-25 17:47:54.839688 mds/journal.cc: 1303: FAILED
>> > assert(session)
>> >
>> > Please look at the attachment for more details.
>> >
>> > Regards,
>> > Bazli
>> >
>> > From: Mohd Bazli Ab Karim
>> > Sent: Friday, April 25, 2014 12:26 PM
>> > To: 'ceph-devel@xxxxxxxxxxxxxxx'; ceph-users@xxxxxxxxxxxxxx
>> > Subject: Ceph mds laggy and failed assert in function replay
>> > mds/journal.cc
>> >
>> > Dear Ceph-devel, ceph-users,
>> >
>> > I am currently facing an issue with my Ceph MDS server. The ceph-mds
>> > daemon will not come back up.
>> > I tried running it manually with ceph-mds -i mon01 -d, but it gets stuck
>> > at the failed assert(session) at line 1303 in mds/journal.cc and aborts.
>> >
>> > Can someone shed some light on this issue?
>> > ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
>> >
>> > Let me know if I need to send a log with debug enabled.
>> >
>> > Regards,
>> > Bazli
>> >
>> > ________________________________
>> > DISCLAIMER:
>> >
>> >
>> > This e-mail (including any attachments) is for the addressee(s) only and
>> > may be confidential, especially as regards personal data. If you are not the
>> > intended recipient, please note that any dealing, review, distribution,
>> > printing, copying or use of this e-mail is strictly prohibited. If you have
>> > received this email in error, please notify the sender immediately and
>> > delete the original message (including any attachments).
>> >
>> >
>> > MIMOS Berhad is a research and development institution under the purview
>> > of the Malaysian Ministry of Science, Technology and Innovation. Opinions,
>> > conclusions and other information in this e-mail that do not relate to the
>> > official business of MIMOS Berhad and/or its subsidiaries shall be
>> > understood as neither given nor endorsed by MIMOS Berhad and/or its
>> > subsidiaries and neither MIMOS Berhad nor its subsidiaries accepts
>> > responsibility for the same. All liability arising from or in connection
>> > with computer viruses and/or corrupted e-mails is excluded to the fullest
>> > extent permitted by law.
>> >
>> >
>> > _______________________________________________
>> > ceph-users mailing list
>> > ceph-users@xxxxxxxxxxxxxx
>> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
From 989084f9703ad6c74adc31abedbbb0e308425683 Mon Sep 17 00:00:00 2001
From: "Yan, Zheng" <zheng.z.yan@xxxxxxxxx>
Date: Sun, 27 Apr 2014 11:53:13 +0800
Subject: [PATCH] ceph: skip bad sessionmap during journal replay

Signed-off-by: Yan, Zheng <zheng.z.yan@xxxxxxxxx>
---
 src/mds/journal.cc | 59 +++++++++++++++++++++++++++++++++---------------------
 1 file changed, 36 insertions(+), 23 deletions(-)

diff --git a/src/mds/journal.cc b/src/mds/journal.cc
index 41a79f9..7c19636 100644
--- a/src/mds/journal.cc
+++ b/src/mds/journal.cc
@@ -1293,36 +1293,49 @@ void EMetaBlob::replay(MDS *mds, LogSegment *logseg, MDSlaveUpdate *slaveup)
     if (mds->sessionmap.version >= sessionmapv) {
       dout(10) << "EMetaBlob.replay sessionmap v " << sessionmapv
 	       << " <= table " << mds->sessionmap.version << dendl;
-    } else {
-      dout(10) << "EMetaBlob.replay sessionmap v" << sessionmapv
+    } else if (mds->sessionmap.version + 2 >= sessionmapv) {
+      dout(10) << "EMetaBlob.replay sessionmap v " << sessionmapv
 	       << " -(1|2) == table " << mds->sessionmap.version
 	       << " prealloc " << preallocated_inos
 	       << " used " << used_preallocated_ino
 	       << dendl;
       Session *session = mds->sessionmap.get_session(client_name);
-      assert(session);
-      dout(20) << " (session prealloc " << session->info.prealloc_inos << ")" << dendl;
-      if (used_preallocated_ino) {
-	if (session->info.prealloc_inos.empty()) {
-	  // HRM: badness in the journal
-	  mds->clog.warn() << " replayed op " << client_reqs << " on session for " << client_name
-			   << " with empty prealloc_inos\n";
-	} else {
-	  inodeno_t next = session->next_ino();
-	  inodeno_t i = session->take_ino(used_preallocated_ino);
-	  if (next != i)
-	    mds->clog.warn() << " replayed op " << client_reqs << " used ino " << i
-			     << " but session next is " << next << "\n";
-	  assert(i == used_preallocated_ino);
-	  session->info.used_inos.clear();
+      if (session) {
+	dout(20) << " (session prealloc " << session->info.prealloc_inos << ")" << dendl;
+	if (used_preallocated_ino) {
+	  if (session->info.prealloc_inos.empty()) {
+	    // HRM: badness in the journal
+	    mds->clog.warn() << " replayed op " << client_reqs << " on session for "
+			     << client_name << " with empty prealloc_inos\n";
+	  } else {
+	    inodeno_t next = session->next_ino();
+	    inodeno_t i = session->take_ino(used_preallocated_ino);
+	    if (next != i)
+	      mds->clog.warn() << " replayed op " << client_reqs << " used ino " << i
+			       << " but session next is " << next << "\n";
+	    assert(i == used_preallocated_ino);
+	    session->info.used_inos.clear();
+	  }
+	  mds->sessionmap.projected = ++mds->sessionmap.version;
 	}
-	mds->sessionmap.projected = ++mds->sessionmap.version;
-      }
-      if (preallocated_inos.size()) {
-	session->info.prealloc_inos.insert(preallocated_inos);
-	mds->sessionmap.projected = ++mds->sessionmap.version;
+	if (preallocated_inos.size()) {
+	  session->info.prealloc_inos.insert(preallocated_inos);
+	  mds->sessionmap.projected = ++mds->sessionmap.version;
+	}
+	assert(sessionmapv == mds->sessionmap.version);
+      } else {
+	mds->clog.warn() << " replayed op " << client_reqs << " no session for "
+			 << client_name << "\n";
+	mds->sessionmap.version = sessionmapv;
+	mds->sessionmap.projected = sessionmapv;
       }
-      assert(sessionmapv == mds->sessionmap.version);
+    } else {
+      mds->clog.error() << "journal replay sessionmap v " << sessionmapv
+			<< " -(1|2) > table " << mds->sessionmap.version << "\n";
+      assert(g_conf->mds_wipe_sessions);
+      mds->sessionmap.wipe();
+      mds->sessionmap.version = sessionmapv;
+      mds->sessionmap.projected = sessionmapv;
     }
   }
 
-- 
1.9.0
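
For anyone reading the patch without the Ceph tree at hand, the three-way
version check it introduces in EMetaBlob::replay can be modelled on its own.
This is only a sketch of the branch conditions: ReplayAction and
classify_sessionmap are hypothetical names made up for illustration, not
Ceph identifiers, and none of the session or inode handling is reproduced.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; these are not Ceph identifiers.
enum class ReplayAction {
  AlreadyApplied,  // on-disk table is at or ahead of the journal entry
  ApplyUpdate,     // journal entry is 1 or 2 versions ahead: replay it
  WipeRequired     // journal entry is more than 2 versions ahead
};

// Mirrors the order of the conditions in the patched replay code:
//   table >= journal       -> nothing to do
//   table + 2 >= journal   -> normal "-(1|2) == table" replay path
//   otherwise              -> sessionmap is too stale; the patch wipes it,
//                             but only if mds_wipe_sessions is set
inline ReplayAction classify_sessionmap(std::uint64_t table_version,
                                        std::uint64_t journal_version) {
  if (table_version >= journal_version)
    return ReplayAction::AlreadyApplied;
  if (table_version + 2 >= journal_version)
    return ReplayAction::ApplyUpdate;
  return ReplayAction::WipeRequired;
}
```

With the values from Bazli's log (table 7239603, journal sessionmap
v8632368) the gap is 1392765 versions, so this falls into the last branch.
In the patch that branch asserts g_conf->mds_wipe_sessions, which is why the
ceph.conf line is needed for the patched MDS to get past replay.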
