Hey Gregory,

My cluster consists of 3 nodes, each running 1 MON, 1 OSD and 1 MDS. I upgraded from 0.67, but was still running 0.61.7 OSDs at the time of the upgrade because of performance issues that have only recently been fixed. These have now been upgraded to 0.67.3, along with the rest of Ceph. My OSDs use XFS as the underlying filesystem.

I had been switching one OSD in my cluster back and forth between 0.61.7 and some test versions, which were based on 0.67.x, to debug the aforementioned performance issues with Samuel, but that was before I newfs'ed and started using this instance of CephFS. Furthermore, I don't seem to have lost any other data during these tests.

BTW: CephFS has never been very stable for me during stress-tests. If some components are brought down and back up again during operations (say, stopping and restarting all components on one node while generating load with a cp of a big CephFS directory tree on another, then, once things settle, doing the same on another node), it always quickly ends up like what I see now: MDSs crashing on start or on attempts to mount the CephFS, with the only way out being to stop the MDSs, wipe the contents of the "data" and "metadata" pools and do the newfs-thing. I can only assume you guys are putting it through similar stress-tests, but if not, try it.

PS: Is there a way to get the data back after something like this? Do you still want me to keep the current situation around to debug it further, or can I zap everything, restore my backups and move on? Thanks!

Regards,

   Oliver

On Tue, 2013-09-10 at 13:59 -0700, Gregory Farnum wrote:
> It's not an upgrade issue. There's an MDS object that is somehow
> missing. If it exists, then on restart you'll be fine.
>
> Oliver, what is your general cluster config? What filesystem are your
> OSDs running on? What version of Ceph were you upgrading from?
> There's really no way for this file to not exist once created unless the
> underlying FS ate it, or the last write was both interrupted and hit
> some kind of bug in our transaction code (of which none are known)
> during replay.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>
> On Tue, Sep 10, 2013 at 1:44 PM, Liu, Larry <Larry.Liu@xxxxxxxxxx> wrote:
> > This is scary. Should I hold off on upgrading?
> >
> > On 9/10/13 11:33 AM, "Oliver Daudey" <oliver@xxxxxxxxx> wrote:
> >
> >> Hey Gregory,
> >>
> >> On 10-09-13 20:21, Gregory Farnum wrote:
> >>> On Tue, Sep 10, 2013 at 10:54 AM, Oliver Daudey <oliver@xxxxxxxxx> wrote:
> >>>> Hey list,
> >>>>
> >>>> I just upgraded to Ceph 0.67.3. What I did on every node of my 3-node
> >>>> cluster was:
> >>>> - Unmount CephFS everywhere.
> >>>> - Upgrade the Ceph packages.
> >>>> - Restart MON.
> >>>> - Restart OSD.
> >>>> - Restart MDS.
> >>>>
> >>>> As soon as I got to the second node, the MDS crashed right after
> >>>> startup.
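[For readers following along: the per-node order Oliver lists above can be sketched as a dry-run script. The package and service names are assumptions for a Debian/sysvinit install of that era, not confirmed from the thread; run() only echoes each command so the order can be reviewed before doing it for real.]

```shell
# Dry-run sketch of the per-node rolling-upgrade order quoted above.
# Package/service names are assumptions (Debian/sysvinit-era Ceph);
# run() echoes instead of executing.
run() { echo "+ $*"; }

run umount /mnt/cephfs                            # 1. unmount CephFS everywhere
run apt-get install --only-upgrade ceph ceph-mds  # 2. upgrade the Ceph packages
run service ceph restart mon                      # 3. restart MON
run service ceph restart osd                      # 4. restart OSD
run service ceph restart mds                      # 5. restart MDS
```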
> >>>>
> >>>> Part of the logs (more on request):
> >>>>
> >>>> -> 194.109.43.12:6802/53419 -- osd_op(mds.0.58:4 mds_snaptable [read 0~0] 1.d90270ad e37647) v4 -- ?+0 0x1e48d80 con 0x1e5d9a0
> >>>> -11> 2013-09-10 19:35:02.798962 7fd1ba81f700  2 mds.0.58 boot_start 1: opening mds log
> >>>> -10> 2013-09-10 19:35:02.798968 7fd1ba81f700  5 mds.0.log open discovering log bounds
> >>>>  -9> 2013-09-10 19:35:02.798988 7fd1ba81f700  1 mds.0.journaler(ro) recover start
> >>>>  -8> 2013-09-10 19:35:02.798990 7fd1ba81f700  1 mds.0.journaler(ro) read_head
> >>>>  -7> 2013-09-10 19:35:02.799028 7fd1ba81f700  1 -- 194.109.43.12:6800/67277 --> 194.109.43.11:6800/16562 -- osd_op(mds.0.58:5 200.00000000 [read 0~0] 1.844f3494 e37647) v4 -- ?+0 0x1e48b40 con 0x1e5db00
> >>>>  -6> 2013-09-10 19:35:02.799053 7fd1ba81f700  1 -- 194.109.43.12:6800/67277 <== mon.2 194.109.43.13:6789/0 16 ==== mon_subscribe_ack(300s) v1 ==== 20+0+0 (4235168662 0 0) 0x1e93380 con 0x1e5d580
> >>>>  -5> 2013-09-10 19:35:02.799099 7fd1ba81f700 10 monclient: handle_subscribe_ack sent 2013-09-10 19:35:02.796448 renew after 2013-09-10 19:37:32.796448
> >>>>  -4> 2013-09-10 19:35:02.800907 7fd1ba81f700  5 mds.0.58 ms_handle_connect on 194.109.43.12:6802/53419
> >>>>  -3> 2013-09-10 19:35:02.800927 7fd1ba81f700  5 mds.0.58 ms_handle_connect on 194.109.43.13:6802/45791
> >>>>  -2> 2013-09-10 19:35:02.801176 7fd1ba81f700  5 mds.0.58 ms_handle_connect on 194.109.43.11:6800/16562
> >>>>  -1> 2013-09-10 19:35:02.803546 7fd1ba81f700  1 -- 194.109.43.12:6800/67277 <== osd.2 194.109.43.13:6802/45791 1 ==== osd_op_reply(3 mds_anchortable [read 0~0] ack = -2 (No such file or directory)) v4 ==== 114+0+0 (3107677671 0 0) 0x1e4de00 con 0x1e5ddc0
> >>>>   0> 2013-09-10 19:35:02.805611 7fd1ba81f700 -1
> >>>> mds/MDSTable.cc: In function 'void MDSTable::load_2(int, ceph::bufferlist&, Context*)' thread 7fd1ba81f700 time 2013-09-10 19:35:02.803673
> >>>> mds/MDSTable.cc: 152: FAILED assert(r >= 0)
> >>>>
> >>>> ceph version 0.67.3 (408cd61584c72c0d97b774b3d8f95c6b1b06341a)
> >>>> 1: (MDSTable::load_2(int, ceph::buffer::list&, Context*)+0x44f) [0x77ce7f]
> >>>> 2: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe3b) [0x7d891b]
> >>>> 3: (MDS::handle_core_message(Message*)+0x987) [0x56f527]
> >>>> 4: (MDS::_dispatch(Message*)+0x2f) [0x56f5ef]
> >>>> 5: (MDS::ms_dispatch(Message*)+0x19b) [0x5710bb]
> >>>> 6: (DispatchQueue::entry()+0x592) [0x92e432]
> >>>> 7: (DispatchQueue::DispatchThread::entry()+0xd) [0x8a59bd]
> >>>> 8: (()+0x68ca) [0x7fd1bed298ca]
> >>>> 9: (clone()+0x6d) [0x7fd1bda5cb6d]
> >>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` is
> >>>> needed to interpret this.
> >>>>
> >>>> When trying to mount CephFS, it just hangs now. Sometimes an MDS stays
> >>>> up for a while, but it will eventually crash again. This CephFS was
> >>>> created on 0.67 and I haven't done anything but mount and use it under
> >>>> very light load in the meantime.
> >>>>
> >>>> Any ideas? If you need more info, let me know. It would be nice to
> >>>> get my data back, but I have backups too.
> >>>
> >>> Does the filesystem have any data in it? Every time we've seen this
> >>> error it's been on an empty cluster which had some weird issue with
> >>> startup.
> >>
> >> This one certainly had some data on it, yes: a couple of hundred GBs
> >> of disk images and a couple of trees of smaller files, most of them
> >> accessed very rarely since being copied on.
> >>
> >>
> >> Regards,
> >>
> >>    Oliver
> >> _______________________________________________
> >> ceph-users mailing list
> >> ceph-users@xxxxxxxxxxxxxx
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
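[Editor's note: the stop/wipe/newfs recovery Oliver describes at the top of the thread, which destroys all CephFS data, can be sketched as a dry-run. The pool names, pool ids, and flags are assumptions for 0.67-era tooling; run() only echoes each command.]

```shell
# Dry-run sketch of the "wipe and newfs" recovery described earlier in the
# thread. Pool names/ids and flags are assumptions for 0.67-era tooling;
# run() echoes instead of executing. THE REAL COMMANDS DESTROY ALL CephFS DATA.
run() { echo "+ $*"; }

run service ceph stop mds                               # stop all MDS daemons first
run rados purge data --yes-i-really-really-mean-it      # wipe the "data" pool
run rados purge metadata --yes-i-really-really-mean-it  # wipe the "metadata" pool
run ceph mds newfs 1 0 --yes-i-really-mean-it           # metadata pool 1, data pool 0 (assumed default ids)
```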