Re: v0.61.6 Cuttlefish update released

On 07/25/2013 11:20 AM, peter@xxxxxxxxx wrote:
On 2013-07-25 12:08, Wido den Hollander wrote:
On 07/25/2013 12:01 PM, peter@xxxxxxxxx wrote:
On 2013-07-25 11:52, Wido den Hollander wrote:
On 07/25/2013 11:46 AM, peter@xxxxxxxxx wrote:
Any news on this? I'm not sure if you guys received the link to the log and monitor files. One monitor and osd is still crashing with the error below.

I think you are seeing this issue: http://tracker.ceph.com/issues/5737

You can try with new packages from here:
http://gitbuilder.ceph.com/ceph-deb-precise-x86_64-basic/ref/wip-5737-cuttlefish/



That should resolve it.

Wido

Hi Wido,

This is the same issue I reported earlier with 0.61.5. I applied the
above package and the problem was solved. Then 0.61.6 was released with
a fix for this issue. I installed 0.61.6 and the issue is back on one of
my monitors and I have one osd crashing. So, it seems the bug is still
there in 0.61.6 or it is a new bug. It seems the guys from Inktank
haven't picked this up yet.


It has been picked up, Sage mentioned this yesterday on the dev list:

"This is fixed in the cuttlefish branch as of earlier this afternoon.
I've spent most of the day expanding the automated test suite to
include upgrade combinations to trigger this and *finally* figured out
that this particular problem seems to surface on clusters that
upgraded from bobtail -> cuttlefish but not clusters created on
cuttlefish.

If you've run into this issue, please use the cuttlefish branch build
for now.  We will have a release out in the next day or so that
includes this and a few other pending fixes.

I'm sorry we missed this one!  The upgrade test matrix I've been
working on today should catch this type of issue in the future."

Wido

Regards,

We created this cluster on cuttlefish, not on bobtail, so that doesn't apply. I'm not sure if it's clear what I'm trying to say or if I'm missing something here, but I still see this issue either way :-)

I will also check out the dev list, but perhaps someone from Inktank can at least look at the files I provided.

Peter,

We did take a look at your files (thanks a lot btw!), and as of last night's patches (which are now on the cuttlefish branch), your store worked just fine.

As Sage mentioned on ceph-devel, one of the issues would only happen on a cluster upgraded from bobtail -> cuttlefish. That is not your issue, though. I believe Sage meant the FAILED assert(latest_full > 0) -- i.e., the one reported in #5737.

Your issue, however, was caused by a bug in a patch meant to fix #5704. It caused an on-disk key to be erroneously updated with a value for a version that did not yet exist at the time update_from_paxos() was called. In a nutshell, one of the latest patches (see 115468c73f121653eec2efc030d5ba998d834e43) fixed that issue, and another patch (see 27f31895664fa7f10c1617d486f2a6ece0f97091) worked around it.
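
To make "a key updated with a value for a version that does not yet exist" a bit more concrete, here is a minimal, self-contained C++ sketch of that general pattern. It is not the actual Ceph monitor code -- the VersionedStore type and the update_buggy()/update_fixed() names are purely illustrative assumptions -- but it shows why a reader that trusts the "latest" key trips an assertion when that key is advanced before the version's data has been written:

// Hypothetical, simplified stand-in for a versioned on-disk store.
// This is NOT Ceph code; it only illustrates the ordering problem.
#include <cassert>
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

struct VersionedStore {
  std::map<uint64_t, std::string> versions;  // version -> serialized state
  uint64_t latest = 0;                       // "latest version" key

  // Buggy pattern: the "latest" key is advanced to a version whose
  // data has not been written yet, so the key briefly points at nothing.
  void update_buggy(uint64_t v, const std::string& blob) {
    latest = v;          // key updated before the version exists
    versions[v] = blob;  // if a reader runs before this, it breaks
  }

  // Guarded pattern: only advance "latest" once the version exists.
  void update_fixed(uint64_t v, const std::string& blob) {
    versions[v] = blob;
    latest = v;
  }

  const std::string& read_latest() const {
    auto it = versions.find(latest);
    assert(it != versions.end() && "latest points at a missing version");
    return it->second;
  }
};

int main() {
  VersionedStore store;
  store.update_fixed(1, "state@1");
  std::cout << store.read_latest() << std::endl;  // prints "state@1"
  return 0;
}

In this simplified picture the two update variants differ only in ordering; the guarded one preserves the invariant that "latest" never references a version the store does not yet contain.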

A point release should come out soon, but in the meantime the cuttlefish branch should be safe to use.

If you run into any other issues, please let us know.

  -Joao

--
Joao Eduardo Luis
Software Engineer | http://inktank.com | http://ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



