Re: One osd crashing daily, the problem with osd.50

Hello,

On Mon, 9 May 2016 09:31:20 +0200 Ronny Aasen wrote:

> hello
> 
> I am running a small lab ceph cluster consisting of 6 old used servers. 

That's larger than quite a few production deployments. ^_-

> They have 36 slots for drives, but too little RAM (32GB max for this
> mainboard) to take advantage of them all. When I get to around 20 OSDs on
> a node, the OOM killer becomes a problem if there are incidents that
> require recovery.
> 
No surprise there; if you're limited to that little RAM, I suspect you'd
run out of CPU power under a full load, too.
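
If you're stuck with that RAM for now, throttling recovery can at least
take some pressure off during incidents. A minimal sketch, the values are
just a conservative starting point and don't fix the underlying shortage:

  # hammer-era knobs to calm recovery/backfill down, injected on the fly
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
  # put the same values under [osd] in ceph.conf to make them persistent:
  #   osd max backfills = 1
  #   osd recovery max active = 1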

> In order to remedy some of the RAM problems I am running the OSDs on
> 5-disk software RAID5 sets. This gives me about seven 12TB OSDs per node
> plus a global hot spare. I have tried this on one of the nodes with good
> success, and I am in the process of doing the migration on the other
> nodes as well.
> 
That's optimizing for space and nothing else.

Having done something similar in the past, I would strongly recommend one
of the following (rough mdadm sketches below):
a) Use RAID6, so that a single (or even double) disk failure never takes
out an OSD. I've personally lost two RAID5 sets of similar size to double
disk failures.

b) Use RAID10 for much improved performance (IOPS). To offset the loss in
space, consider running with a replication of 2; with the RAID layer
protecting each OSD that is reasonably safe, and the same applies to
option a).
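
For reference, a rough sketch of both layouts; the device names, the
6-disk group and the pool name are made up, substitute your own:

  # option a) RAID6, survives any two disk failures in the set
  mdadm --create /dev/md1 --level=6 --raid-devices=6 /dev/sd[b-g]
  # option b) RAID10, much better IOPS at the cost of half the raw space
  mdadm --create /dev/md1 --level=10 --raid-devices=6 /dev/sd[b-g]
  # if you then drop to replication 2, set it per pool, e.g.:
  ceph osd pool set rbd size 2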

> I am running on Debian jessie using the 0.94.6 Hammer release from Ceph's repo.
> 
> But an issue has started appearing on one of these RAID5 OSDs.
> 
> osd.50 has a tendency to stop roughly daily with the error message seen in
> the log below. The OSD is running on a healthy software RAID5 disk, and I
> can see nothing in dmesg or any other log that would indicate a problem
> with this md device.

The key part of that log is the EIO failed assert. If you google for
"FAILED assert(allow_eio" you will get hits from last year; this is an FS
issue and has nothing to do with the RAID per se.

Which FS are you using?

If it's not BTRFS, and since your other OSDs are not having issues, it
might be worth going over this FS with a fine-tooth comb.
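
If it is XFS (an assumption on my part), something along these lines with
the OSD stopped and the FS unmounted; /dev/md2 stands in for whatever md
device backs osd.50:

  # read-only structural check of the FS, -n makes no changes
  xfs_repair -n /dev/md2
  # also kick off an md consistency check and watch it complete
  echo check > /sys/block/md2/md/sync_action
  cat /proc/mdstat
  mdadm --detail /dev/md2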

The "near full" OSD is something that you want to address, too.

> Once I restart the OSD it is up and in, and usually stays up and in for
> anywhere from a few hours to a few days. The other 6 OSDs on this node do
> not show the same problem. I have restarted this OSD about 8-10 times, so
> it's fairly regular.
> 
You might have to bite the bullet and re-create it if you can't find the
issue.
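
If it comes to that, the usual removal dance looks roughly like this; how
you re-create the OSD afterwards depends on how you built it in the first
place (ceph-disk prepare or your manual procedure):

  ceph osd out 50                  # let the cluster drain it first
  # once recovery has finished, stop the daemon
  /etc/init.d/ceph stop osd.50     # or "service ceph stop osd.50"
  ceph osd crush remove osd.50
  ceph auth del osd.50
  ceph osd rm 50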

> The RAID5 sets are 12TB, so I was hoping to be able to fix the problem
> rather than zapping the md device and recreating it from scratch. I was
> also wondering if there is something fundamentally wrong about running
> OSDs on software md RAID5 devices.
> 
No problem in and of itself, other than reduced performance.

Regards,

Christian
-- 
Christian Balzer        Network/Systems Engineer                
chibi@xxxxxxx   	Global OnLine Japan/Rakuten Communications
http://www.gol.com/