[CEPH-LIST]: problem with OSDs not coming up

Hi,

I'm trying to test a Ceph 9.2 cluster.

 

My lab has 1 monitor and 2 OSD hosts with 4 disks each.

 

Only one OSD server (with its 4 disks) is online.

The OSDs on the second server don't come up ...

 

Some info about the environment:

[ceph@OSD1 ~]$ sudo ceph osd tree

ID  WEIGHT  TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 8.00000 root default

-4 8.00000     datacenter dc1

-5 8.00000         room room1

-6 8.00000             row row1

-7 4.00000                 rack rack1

-2 4.00000                     host OSD1

  0 1.00000                         osd.0      up  1.00000          1.00000

  1 1.00000                         osd.1      up  1.00000          1.00000

  2 1.00000                         osd.2      up  1.00000          1.00000

  3 1.00000                         osd.3      up  1.00000          1.00000

-8 4.00000                 rack rack2

-3 4.00000                     host OSD2

  4 1.00000                         osd.4    down  1.00000          1.00000

  5 1.00000                         osd.5    down  1.00000          1.00000

  6 1.00000                         osd.6    down  1.00000          1.00000

  7 1.00000                         osd.7    down  1.00000          1.00000

 

[ceph@OSD1 ceph-deploy]$ sudo ceph osd dump
epoch 411
fsid d17520de-0d1e-495b-90dc-f7044f7f165f

created 2015-11-14 06:56:36.017672

modified 2015-12-08 09:48:47.685050

flags nodown

pool 0 'rbd' replicated size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 53 flags hashpspool min_write_recency_for_promote 1 stripe_width 0
max_osd 9

osd.0 up   in  weight 1 up_from 394 up_thru 104 down_at 393 last_clean_interval [388,392) 192.168.64.129:6800/4599 192.168.62.129:6800/4599 192.168.62.129:6801/4599 192.168.64.129:6801/4599 exists,up 499a3624-b2ba-455d-b35a-31d628e1a353

osd.1 up   in  weight 1 up_from 396 up_thru 136 down_at 395 last_clean_interval [390,392) 192.168.64.129:6802/4718 192.168.62.129:6802/4718 192.168.62.129:6803/4718 192.168.64.129:6803/4718 exists,up d7933117-0056-4c3c-ac63-2ad300495e3f

osd.2 up   in  weight 1 up_from 400 up_thru 136 down_at 399 last_clean_interval [392,392) 192.168.64.129:6806/5109 192.168.62.129:6806/5109 192.168.62.129:6807/5109 192.168.64.129:6807/5109 exists,up 7d820897-8d49-4142-8c58-feda8bb04749

osd.3 up   in  weight 1 up_from 398 up_thru 136 down_at 397 last_clean_interval [386,392) 192.168.64.129:6804/4963 192.168.62.129:6804/4963 192.168.62.129:6805/4963 192.168.64.129:6805/4963 exists,up 96270d9d-ed95-40be-9ae4-7bf66aedd4d8

osd.4 down out weight 0 up_from 34 up_thru 53 down_at 58 last_clean_interval [0,0) 192.168.64.130:6800/3615 192.168.64.130:6801/3615 192.168.64.130:6802/3615 192.168.64.130:6803/3615 autoout,exists 6364d590-62fb-4348-b8fe-19b59cd2ceb3

osd.5 down out weight 0 up_from 145 up_thru 151 down_at 203 last_clean_interval [39,54) 192.168.64.130:6800/2784 192.168.62.130:6800/2784 192.168.62.130:6801/2784 192.168.64.130:6801/2784 autoout,exists aa51cdcc-ca9c-436b-b9fc-7bddaef3226d

osd.6 down out weight 0 up_from 44 up_thru 53 down_at 58 last_clean_interval [0,0) 192.168.64.130:6808/4975 192.168.64.130:6809/4975 192.168.64.130:6810/4975 192.168.64.130:6811/4975 autoout,exists 36672496-3346-446a-a617-94c8596e1da2

osd.7 down out weight 0 up_from 155 up_thru 161 down_at 204 last_clean_interval [49,54) 192.168.64.130:6800/2434 192.168.62.130:6800/2434 192.168.62.130:6801/2434 192.168.64.130:6801/2434 autoout,exists 775065fa-8fa8-48ce-a4cc-b034a720fe93

 

All the UUIDs are correct (the down state only appeared after an upgrade).
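
To double-check them I compared the UUID stored in each OSD's data directory with the one the monitor reports; osd.4 below is just an example and the paths assume the default data directory layout:

# on OSD2: the UUID recorded on disk for this OSD
sudo cat /var/lib/ceph/osd/ceph-4/fsid
# on any node: the last field of the matching osd dump line is the UUID the monitor has
sudo ceph osd dump | grep '^osd.4 '

The two values should be identical, and in my case they are.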

Now I'm not able to create any OSDs with ceph-deploy.

I removed all the OSD disks from the cluster and deployed them again from scratch.
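
For reference, the removal and redeploy went roughly like this for each OSD (id 4 and OSD2:sdb are examples, the exact ids and devices differ per disk):

# remove the old OSD from the CRUSH map, the auth database and the OSD map
sudo ceph osd out 4
sudo ceph osd crush remove osd.4
sudo ceph auth del osd.4
sudo ceph osd rm 4
# from the ceph-deploy admin directory: wipe the disk and create a new OSD on it
ceph-deploy disk zap OSD2:sdb
ceph-deploy osd create OSD2:sdb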

I had some problems with the OSD service starting with the system, but now the service appears to be running.
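
Each OSD instance is enabled under ceph.target (that is where the wants-link shown in the status output below comes from), for example:

# creates the symlink in /etc/systemd/system/ceph.target.wants/
sudo systemctl enable ceph-osd@4.service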

 

[ceph@OSD2 ~]$ sudo systemctl start ceph-osd@4.service

[ceph@OSD2 ~]$ sudo systemctl status ceph-osd@4.service -l
ceph-osd@4.service - Ceph object storage daemon

   Loaded: loaded (/etc/systemd/system/ceph.target.wants/ceph-osd@4.service)

   Active: active (running) since Tue 2015-12-08 10:31:38 PST; 9s ago

  Process: 6542 ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
 Main PID: 6599 (ceph-osd)

   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@4.service

           └─6599 /usr/bin/ceph-osd --cluster ceph --id 4 --setuser ceph --setgroup ceph -f

 

Dec 08 10:31:38 OSD2.local ceph-osd-prestart.sh[6542]: create-or-move updated item name 'osd.4' weight 0.0098 at location {host=OSD2,root=default} to crush map
Dec 08 10:31:38 OSD2.local systemd[1]: Started Ceph object storage daemon.

Dec 08 10:31:38 OSD2.local ceph-osd[6599]: starting osd.4 at :/0 osd_data /var/lib/ceph/osd/ceph-4 /dev/sdb1
Dec 08 10:31:38 OSD2.local ceph-osd[6599]: 2015-12-08 10:31:38.702018 7f0a555fc900 -1 osd.4 411 log_to_monitors {default=true}

[ceph@OSD2 ~]$

[ceph@OSD2 ~]$ sudo ceph osd tree

ID  WEIGHT  TYPE NAME                     UP/DOWN REWEIGHT PRIMARY-AFFINITY

-1 4.03918 root default

-4 4.03918     datacenter dc1

-5 4.03918         room room1

-6 4.03918             row row1

-7 4.00000                 rack rack1

-2 4.00000                     host OSD1

  0 1.00000                         osd.0      up  1.00000          1.00000

  1 1.00000                         osd.1      up  1.00000          1.00000

  2 1.00000                         osd.2      up  1.00000          1.00000

  3 1.00000                         osd.3      up  1.00000          1.00000

-8 0.03918                 rack rack2

-3 0.03918                     host OSD2

  4 0.00980                         osd.4    down        0          1.00000

  5 0.00980                         osd.5    down        0          1.00000

  6 0.00980                         osd.6    down        0          1.00000

  7 0.00980                         osd.7    down        0          1.00000

 

The systemd unit that starts the OSD is:

[root@OSD2 ceph-4]# cat /etc/systemd/system/ceph.target.wants/ceph-osd\@4.service

[Unit]

Description=Ceph object storage daemon

After=network-online.target local-fs.target
Wants=network-online.target local-fs.target
PartOf=ceph.target

 

[Service]

LimitNOFILE=1048576

LimitNPROC=1048576

EnvironmentFile=-/etc/sysconfig/ceph

Environment=CLUSTER=ceph

ExecStart=/usr/bin/ceph-osd --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph -f
ExecStartPre=/usr/libexec/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i
ExecReload=/bin/kill -HUP $MAINPID

 

[Install]

WantedBy=ceph.target
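
For completeness, after any change to this unit file I reload systemd and restart the instance:

# pick up the edited unit, then restart osd.4
sudo systemctl daemon-reload
sudo systemctl restart ceph-osd@4.service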

 

I see all the disks of OSD2 always down … can someone help me with troubleshooting?

Any ideas???

This situation is driving me crazy!!

 

Thanks.

Andrea.

 

