Re: OSD down

Hi Daniel,

When I encounter an OSD that I can start, but which then stops on its own after running for some period of time, the root cause has generally been sectors pending reallocation on the hard drive the OSD is using. The OSD runs fine until it attempts to read from the bad sectors, at which point it hits a read error and drops offline.

You can check the disk using smartmontools (smartctl). If there are sectors pending reallocation, remove the OSD from the cluster, use dd to write zeros over the drive (this destroys the data on it, and causes the drive to remap the bad sectors to spare sectors), then re-add the OSD to the cluster.
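
The SMART check described above can be sketched in Python as follows. This is only an illustration, not a tested procedure: it assumes the usual attribute table printed by `smartctl -A`, where the raw value is the tenth column and the attribute is named `Current_Pending_Sector`. The function names and the sample table are hypothetical and not taken from this thread.

```python
import subprocess
from typing import Optional

# Sample `smartctl -A` rows, made up for illustration (not from this thread).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
"""


def pending_sectors(smartctl_output: str) -> Optional[int]:
    """Return the raw Current_Pending_Sector count, or None if the row is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns; the name is the 2nd, the raw value the 10th.
        if len(fields) >= 10 and fields[1] == "Current_Pending_Sector":
            return int(fields[9])
    return None


def read_attributes(device: str) -> str:
    """Run `smartctl -A` on a device (usually needs root) and return its text output."""
    # smartctl uses its exit status as a bitmask, so a non-zero code is not treated as fatal here.
    result = subprocess.run(["smartctl", "-A", device], capture_output=True, text=True)
    return result.stdout
```

On a real host this would be called as `pending_sectors(read_attributes("/dev/sdX"))`, where `/dev/sdX` is a placeholder for the OSD's data disk. A non-zero result would suggest the drive is a candidate for the remove, zero, and re-add procedure above.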

-Steve

On 02/05/2015 08:19 AM, Daniel Takatori Ohara wrote:
Hello Alex,

Thanks for the answer.

On the servers I use CentOS 6.6 with kernel 2.6.32, and on the clients I use Ubuntu 14 with kernel 3.16.

The Ceph version is 0.87.

Thanks,

Regards,

---
Daniel Takatori Ohara.
System Administrator - Lab. of Bioinformatics
Molecular Oncology Center 
Instituto Sírio-Libanês de Ensino e Pesquisa
Hospital Sírio-Libanês
Phone: +55 11 3155-0200 (extension 1927)
R: Cel. Nicolau dos Santos, 69
São Paulo-SP. 01308-060


On Thu, Feb 5, 2015 at 10:43 AM, Alexis KOALLA <alexis.koalla@xxxxxxxxxx> wrote:
Hi Daniel,
Could you be more precise about your issue, please?
Which OS is your Ceph cluster running on, and which Ceph version are you currently running?

Anyway, I have experienced an issue that looks like yours.
I have installed and configured a small cluster, "microceph", on my PC for quick demos. On this cluster I have 4 OSDs and 1 MON. There is no MDS.
I have written a script that starts the cluster.
In this script I start the monitor: ceph-mon -c /path/to/yourceph/confile -i <mon_id>
I also manually start the 4 OSDs like this: ceph-osd -c /path/to/yourceph/confile -i <osd_id>

I also forced the OSDs to be "in" after the start.
Right now it works fine, but I don't think this is the right way to proceed (starting the OSDs manually and putting them "in").
Maybe it can give you an idea of where to start investigating.
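
A start-up script along the lines described above might look like the following Python sketch. It only builds the command lines and leaves running them to the caller. The `ceph-mon` and `ceph-osd` invocations are the ones quoted above. Using `ceph osd in <id>` to mark each OSD "in" is an assumption about how that step was done, and the function name and arguments are hypothetical.

```python
from typing import List, Sequence


def startup_commands(conf_path: str, mon_id: str, osd_ids: Sequence[int]) -> List[List[str]]:
    """Build the commands to start one monitor and several OSDs, then mark the OSDs 'in'."""
    # Start the monitor first so the OSDs have something to register with.
    commands = [["ceph-mon", "-c", conf_path, "-i", mon_id]]
    # Start each OSD daemon manually, as in the script described above.
    for osd_id in osd_ids:
        commands.append(["ceph-osd", "-c", conf_path, "-i", str(osd_id)])
    # Assumption: the OSDs are forced "in" with `ceph osd in <id>`.
    for osd_id in osd_ids:
        commands.append(["ceph", "-c", conf_path, "osd", "in", str(osd_id)])
    return commands
```

Each returned list can be passed to `subprocess.run(...)` in order. As noted above, this is a workaround for a demo cluster rather than the recommended way to manage OSDs.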

Regards
Alex


On 05/02/2015 11:29, Daniel Takatori Ohara wrote:
Hi, could anyone help me, please?

I have a cluster with 4 OSDs, 1 MDS and 1 MON.

osd.3 went down, and I needed to restart it on its host with the command /etc/init.d/ceph restart osd.3.

osd.0 is sometimes marked down, but it is marked up again automatically.

[ceph@ceph-admin my-cluster]$ ceph osd tree
# id    weight  type name       up/down reweight
-1      50.63   root default
-2      13.84           host ceph-osd1
0       13.84                   osd.0   up      1
-3      14.76           host ceph-osd2
1       14.76                   osd.1   up      1
-4      22.03           host ceph-osd3
2       10.09                   osd.2   up      0.8
3       11.94                   osd.3   down    0
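
For monitoring, output like the tree above can be scanned for OSDs that are down. The Python sketch below assumes the column layout shown here (id, weight, name, up/down, reweight on OSD rows), which may differ in other Ceph versions. The function name is hypothetical.

```python
from typing import List

# The `ceph osd tree` output quoted in this message.
SAMPLE_TREE = """\
# id    weight  type name       up/down reweight
-1      50.63   root default
-2      13.84           host ceph-osd1
0       13.84                   osd.0   up      1
-3      14.76           host ceph-osd2
1       14.76                   osd.1   up      1
-4      22.03           host ceph-osd3
2       10.09                   osd.2   up      0.8
3       11.94                   osd.3   down    0
"""


def down_osds(tree_output: str) -> List[str]:
    """Return the names of OSDs whose status column reads 'down'."""
    down = []
    for line in tree_output.splitlines():
        fields = line.split()
        # OSD rows look like: <id> <weight> osd.<n> <up|down> <reweight>
        if len(fields) >= 4 and fields[2].startswith("osd.") and fields[3] == "down":
            down.append(fields[2])
    return down
```

Applied to the output above, `down_osds(SAMPLE_TREE)` picks out osd.3.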

Can anyone help me, please?

Thanks,

Regards,

---
Daniel Takatori Ohara.
System Administrator - Lab. of Bioinformatics
Molecular Oncology Center 
Instituto Sírio-Libanês de Ensino e Pesquisa
Hospital Sírio-Libanês
Phone: +55 11 3155-0200 (extension 1927)
R: Cel. Nicolau dos Santos, 69
São Paulo-SP. 01308-060



_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--

Alexis KOALLA

Orange/IMT/OLPS/ASE/DAPI/CSE

Specialist in Technologies/Cloud Storage Services & Platforms

Tel :+33(0) 299 124 939 / +33 670 698 929
alexis.koalla@xxxxxxxxxx





-- 
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma310@xxxxxxxxxx
