Re: MDS stuck in a crash loop

On Thu, 22 Oct 2015, John Spray wrote:
> On Thu, Oct 22, 2015 at 1:43 PM, Milosz Tanski <milosz@xxxxxxxxx> wrote:
> > On Wed, Oct 21, 2015 at 5:33 PM, John Spray <jspray@xxxxxxxxxx> wrote:
> >> On Wed, Oct 21, 2015 at 10:33 PM, John Spray <jspray@xxxxxxxxxx> wrote:
> >>>> John, I know you've got
> >>>> https://github.com/ceph/ceph-qa-suite/pull/647. I think that's
> >>>> supposed to be for this, but I'm not sure if you spotted any issues
> >>>> with it or if we need to do some more diagnosing?
> >>>
> >>> That test path is just verifying that we do handle dirs without dying
> >>> in at least one case -- it passes with the existing ceph code, so it's
> >>> not reproducing this issue.
> >>
> >> Clicked send too soon; I was about to add...
> >>
> >> Milosz mentioned that they don't have the data from the system in the
> >> broken state, so I don't have any bright ideas about learning more
> >> about what went wrong here unfortunately.
> >>
> >
> > Sorry about that, wasn't thinking at the time and just wanted to get
> > this up and going as quickly as possible :(
> >
> > If this happens next time I'll be more careful to keep more evidence.
> > I think multi-fs support in the same rados namespace would actually have
> > helped here, since it makes it easier to create a newfs and leave the
> > other one around (for investigation).
> 
> Yep, good point.  I am a known enthusiast for multi-filesystem support :-)

A rados pool export on the metadata pool would have helped, too.  That 
doesn't include the backtrace metadata stored on the data objects, though.  
I wonder if we should make a cephfs metadata imaging tool (similar to the 
tools that are available for xfs) that captures the metadata state of the 
file system from both pools.  On the data pool side it'd just record the 
object names, xattrs, and object sizes, ignoring the data itself.
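As a very rough illustration (not a worked-out design), the data pool side 
could look something like the sketch below using the python-rados bindings; 
the conf path, pool name, and line-per-object JSON output are just 
placeholders:

    #!/usr/bin/env python
    #
    # Very rough sketch, not the proposed tool: walk a CephFS data pool and
    # record per-object metadata (name, size, mtime, xattrs) while skipping
    # the object contents entirely.  Uses the python-rados bindings; the conf
    # path, pool name, and JSON-lines output format are assumptions.

    import binascii
    import json
    import rados

    CONF_PATH = '/etc/ceph/ceph.conf'   # assumed config location
    DATA_POOL = 'cephfs_data'           # assumed data pool name

    def capture_data_pool(out_path):
        cluster = rados.Rados(conffile=CONF_PATH)
        cluster.connect()
        try:
            ioctx = cluster.open_ioctx(DATA_POOL)
            try:
                with open(out_path, 'w') as out:
                    for obj in ioctx.list_objects():
                        size, mtime = ioctx.stat(obj.key)
                        # Backtrace info lives in xattrs (e.g. "parent"), so
                        # keep xattr names/values but never read object data.
                        xattrs = dict(
                            (name, binascii.hexlify(val).decode('ascii'))
                            for name, val in ioctx.get_xattrs(obj.key))
                        record = {'name': obj.key,
                                  'size': size,
                                  'mtime': str(mtime),
                                  'xattrs': xattrs}
                        out.write(json.dumps(record) + '\n')
            finally:
                ioctx.close()
        finally:
            cluster.shutdown()

    if __name__ == '__main__':
        capture_data_pool('datapool-image.jsonl')

The metadata pool side could then be a plain object dump, since that's 
where the directory objects and journal live.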

It wouldn't anonymize filenames (that is tricky without breaking the mds 
dir hashing), but it would exclude the file data and would probably be 
sufficient for most users...

sage


