Re: The stability of OSD

Hi,

Sage Weil wrote:
> On Mon, 28 Mar 2011, Sylar Shen wrote:
>> Hi,
>> I set up an environment of 20 servers, which include 2 MDSs, 3 MONs and
>> 18 OSDs (the 3 monitors run on the OSD servers).
>> My version is 0.24.3 and the OS is Fedora 14.
>> I ran into a problem while doing write tests.
>> Whether or not I was writing data, some OSDs were randomly marked
>> down and out, one by one, after a period of time.
>> When that happened, overall performance quickly got worse and worse.
>> I checked /var/log/ceph/osd.log but found nothing.
>> Has anyone else seen the same problem?
>> Or maybe it's just a problem with my hardware......><
>
> Hi Sylar,
>
> This is/was a known problem. There's a long thread from a couple weeks back with Jim Schutt debugging the issue. We've fixed a few different things that have significantly improved the situation, but the heartbeats are still failing from time to time.

I'm about ready to test the master branch again, built
against tcmalloc and libatomic_ops - I had to add those
things into my RHEL 5 environment.  I'll be reporting
results for that soon; sorry for the delay.

-- Jim


> I suspect using a more recent release will be sufficient at your scale, either 0.25.2 or the latest 'next' branch from git (there are autobuilt debs for that too). You can also increase the 'osd heartbeat grace' to make the system less sensitive to the transient hangs that are preventing the heartbeats from going out.
>
> Please let us know what you find, either here or on #ceph.
>
> Thanks!
> sage
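
To illustrate why a larger 'osd heartbeat grace' hides short hangs, here is a toy sketch of grace-period failure detection. It is not Ceph's actual code: the function name, the peer names, the timestamps and the grace values 20 and 40 are all made up for illustration, and the only assumption is that a peer is reported once its last heartbeat is older than the grace period.

```python
# Toy model of grace-period failure detection (NOT Ceph's implementation).
# All names and numbers below are hypothetical.

def peers_past_grace(last_heard, now, grace):
    """Return the peers whose last heartbeat is older than `grace` seconds."""
    return sorted(peer for peer, t in last_heard.items() if now - t > grace)

# Hypothetical timestamps (in seconds) of the last heartbeat from each peer.
# osd1 has been silent for 26 s, e.g. because of a transient hang.
last_heard = {"osd0": 100.0, "osd1": 75.0, "osd2": 99.0}
now = 101.0

print(peers_past_grace(last_heard, now, grace=20))  # ['osd1']: 26 s > 20 s
print(peers_past_grace(last_heard, now, grace=40))  # []: the stall is tolerated
```

The trade-off is that a longer grace period also delays detection of an OSD that really has died, so it only masks the transient hangs rather than fixing them.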

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html