Re: mds crash on snaptest-2

On Tue, Jul 27, 2010 at 2:22 AM, Thomas Mueller <thomas@xxxxxxxxxxxxxx> wrote:
> On Mon, 19 Jul 2010 14:16:14 -0700, Gregory Farnum wrote:
>
>> Can you turn on debugging and verify for me that it's crashing on
>> "assert(p->second.first <= snapid && snapid <= p->first); "
>> CInode::encode_inodestat:1617?
>> I've hit this assert trying to reproduce your issue using cfuse and I
>> think this is it, but I'm hitting some ext3 bugs in my kernel on a
>> fairly regular basis while trying to reproduce, so a fix will need to
>> wait until I've upgraded (tomorrow). :) Thanks!
>> -Greg
>
> hi greg
>
> the test still fails with ceph.git/unstable from today. cmds no longer
> exits, but after half an hour the test kills itself because of a
> timeout (normal running time is about 10 minutes).
>
> - Thomas
>
> PS: found out that vstart.sh places logs in subdir "out" too. so tell me
> if you need some of them.
Yes, I've been working on this for some time now. If you run the test
on a single MDS it should work fine with the latest git, but there are
deeper issues with an MDS cluster that we're having a hard time
isolating well enough to fix. It appears we might need to rework our
snapshot inode handling a bit, and Sage has asked me to move on.

I'd recommend doing your testing on a single-MDS system (if using
vstart: CEPH_NUM_MDS=1 ./vstart.sh; the same works for CEPH_NUM_OSD and
CEPH_NUM_MON) until we say that we expect the MDS cluster to work under
more circumstances.
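
For reference, the single-MDS setup could look roughly like the sketch
below. It assumes you run it from the directory that contains vstart.sh;
the fallback message is only there for illustration and is not part of
vstart itself. The OSD and MON counts are left at their defaults.

```shell
#!/bin/sh
# Sketch: start a vstart development cluster with exactly one MDS.
# Assumption: the current directory is the one holding vstart.sh.

# Number of MDS daemons vstart should start.
CEPH_NUM_MDS=1
export CEPH_NUM_MDS

# CEPH_NUM_OSD and CEPH_NUM_MON can be set and exported the same way
# if you also want to pin the number of OSDs or monitors.

if [ -x ./vstart.sh ]; then
    ./vstart.sh
else
    # Illustrative fallback so the sketch does not fail outside a source tree.
    echo "vstart.sh not found here; would start with CEPH_NUM_MDS=$CEPH_NUM_MDS"
fi
```

Once the cluster is up, rerun snaptest-2 against it as before.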