Hmm, okay. Unfortunately the logs don't have any clues. I'm giving up on
solving the mystery this time around. You can go ahead and stop all the
daemons and re-run mkcephfs.

I've added a 'scripts/check_pglog.sh $osddatadir' script that will just
look for any corruption. Running that periodically will let you verify
that you haven't hit the same corruption without having to restart cosd.

Ideally we can figure out what kind of workload is triggering the
problem, and then reproduce it with sufficient logging enabled to find
where the race is taking place. If you have any details about the
workload, or about any failure/recovery activity that may have been
going on at the time, that may shed some light on it...

FYI, this is http://tracker.newdream.net/issues/114

Thanks!
sage

On Tue, 8 Jun 2010, Andre Noll wrote:

> On Mon, Jun 07, 10:20, Sage Weil wrote:
>
> > Do you have osd logs?
>
> Should all be there. At least I did not remove anything.
>
> > This is the same corruption I've seen previously, but I've just
> > reaudited the code I suspect and it looks ok. Some insight into what
> > happened to the cluster would help. Which osd is it?
>
> It's osd6 running on node142. I was running osd from ceph-v0.20.2 but
> also tried the osd compiled from the "testing" branch of the git repo.
>
> > Do you still have the logs (/var/log/ceph/osd$n*)? The osdmap
> > sequence (tarball of $mon_data/osdmap) would be helpful too.
>
> I created a tarball of the full /var/log/ceph directory of node142
> (which is only a storage node) and will send it to you off-line.
>
> We have three monitors. I'll send a tarball of /var/ceph/mon0/monmap/
> of mon0 (node141) as well.
>
> Thanks
> Andre
> --
> The only person who always got his work done by Friday was Robinson Crusoe
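
A minimal sketch of how the periodic check mentioned above might be
wired up from cron. The checkout path, the osd data directory, and the
assumption that check_pglog.sh exits non-zero when it detects corruption
are guesses for illustration, not details from the thread; adjust them
to your own setup.

    #!/bin/sh
    # check_pglog_cron.sh (hypothetical wrapper, not part of the ceph tree)
    #
    # Assumed locations; change these to match your installation.
    CEPH_SRC=/usr/src/ceph        # checkout containing scripts/check_pglog.sh
    OSD_DATA=/data/osd6           # osd data dir (osd6 on node142 in this thread)

    # Assumes check_pglog.sh exits non-zero when it finds corruption.
    if ! "$CEPH_SRC/scripts/check_pglog.sh" "$OSD_DATA"; then
        echo "pg log corruption detected in $OSD_DATA (see issue 114)" \
            | mail -s "check_pglog failure on $(hostname)" root
    fi

An hourly crontab entry for it could look like:

    0 * * * * /usr/local/sbin/check_pglog_cron.sh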