Re: OSD:s failing out after upgrade to 9.2.0 on Ubuntu 14.04


 



Sorry for the near double post. I noticed that the earlier output makes it look like one mon is down, but the mons actually seem to be OK. The 11 OSD:s that are in fall out again, and I am back at 7 healthy OSD:s:

 

root@black:/var/lib/ceph/mon# ceph -s

    cluster ee8eae7a-5994-48bc-bd43-aa07639a543b

     health HEALTH_WARN

            108 pgs backfill

            37 pgs backfilling

            2339 pgs degraded

            105 pgs down

            237 pgs peering

            138 pgs stale

            765 pgs stuck degraded

            173 pgs stuck inactive

            138 pgs stuck stale

            3327 pgs stuck unclean

            765 pgs stuck undersized

            2339 pgs undersized

            recovery 1612956/6242357 objects degraded (25.839%)

            recovery 772311/6242357 objects misplaced (12.372%)

            too many PGs per OSD (561 > max 350)

            4/11 in osds are down

     monmap e3: 3 mons at {black=172.16.0.201:6789/0,orange=172.16.0.203:6789/0,purple=172.16.0.202:6789/0}

            election epoch 456, quorum 0,1,2 black,purple,orange

     mdsmap e5: 0/0/1 up

     osdmap e35627: 12 osds: 7 up, 11 in; 1201 remapped pgs

      pgmap v8215121: 4608 pgs, 3 pools, 11897 GB data, 2996 kobjects

            17203 GB used, 8865 GB / 26069 GB avail

            1612956/6242357 objects degraded (25.839%)

            772311/6242357 objects misplaced (12.372%)

                2137 active+undersized+degraded

                1052 active+clean

                 783 active+remapped

                 137 stale+active+undersized+degraded

                 104 down+peering

                 102 active+remapped+wait_backfill

                  66 remapped+peering

                  65 peering

                  33 active+remapped+backfilling

                  27 activating+undersized+degraded

                  26 active+undersized+degraded+remapped

                  25 activating

                  16 remapped

                  14 inactive

                   7 activating+remapped

                   6 active+undersized+degraded+remapped+wait_backfill

                   4 active+undersized+degraded+remapped+backfilling

                   2 activating+undersized+degraded+remapped

                   1 down+remapped+peering

                   1 stale+remapped+peering

recovery io 22108 MB/s, 5581 objects/s

  client io 1065 MB/s rd, 2317 MB/s wr, 11435 op/s

 

From: ceph-users [mailto:ceph-users-bounces@xxxxxxxxxxxxxx] On Behalf Of Claes Sahlström
Sent: den 15 november 2015 21:56
To: ceph-users@xxxxxxxxxxxxxx
Subject: OSD:s failing out after upgrade to 9.2.0 on Ubuntu 14.04

 

Hi,

 

I have a problem that I hope can be solved.

 

I upgraded to 9.2.0 a couple of days ago and missed this part:

If your systems already have a ceph user, upgrading the package will cause problems. We suggest you first remove or rename the existing ‘ceph’ user and ‘ceph’ group before upgrading.

 

I guess that might be the reason why my OSD:s have started to die on me.

 

I can get the OSD services to start when the files are owned by root:root and I use:

setuser match path = /var/lib/ceph/$type/$cluster-$i
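
To confirm that the option really is present in the config file the daemons read, something like the following could be used. This is only a minimal sketch: the helper name show_setuser_match is made up for illustration, and /etc/ceph/ceph.conf is an assumed default location that is not stated above. The pattern accepts both the space and the underscore spelling of the option name.

```shell
# Hypothetical helper: print every "setuser match path" line of a config
# file together with its line number. Returns non-zero if the option is absent.
show_setuser_match() {
    grep -n -E '^[[:space:]]*setuser[ _]match[ _]path[[:space:]]*=' "$1"
}

# Assumed default location; adjust if the cluster uses another file.
# show_setuser_match /etc/ceph/ceph.conf
```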

 

I am really not sure where to look to find out what is wrong.

 

Right after the upgrade, when the OSD:s were restarted, I got a permission denied error on the OSD directories. That was solved by adding the “setuser match” line to ceph.conf.
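
For anyone wanting to check the same thing, the ownership of the OSD directories can be listed roughly like this. It is a sketch under assumptions: the helper name list_not_owned_by is hypothetical, and /var/lib/ceph/osd is taken from the path in the setting above with $type expanded to osd.

```shell
# Hypothetical helper: list entries directly under DIR that are NOT owned
# by OWNER (user name or numeric uid). Empty output means everything matches.
list_not_owned_by() {
    find "$1" -mindepth 1 -maxdepth 1 ! -user "$2" -print
}

# Assumed OSD data path; root is the owner described above.
# list_not_owned_by /var/lib/ceph/osd root
```

Any directory printed here has a different owner than expected, which would be consistent with a permission denied error at OSD start.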

 

With 5 of 12 OSD:s down I am starting to worry, and since I only have one replica I might lose some data. As I mentioned, the OSD services start and “ceph osd in” does not give me any error, but the OSD:s never come up.

 

Any suggestions or helpful tips are most welcome,

 

/Claes

 

 

 

 

 

 

ID WEIGHT   TYPE NAME       UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 24.00000 root default
-2  8.00000     host black
 3  2.00000         osd.3        up  1.00000          1.00000
 2  2.00000         osd.2        up  1.00000          1.00000
 0  2.00000         osd.0        up  1.00000          1.00000
 1  2.00000         osd.1        up  1.00000          1.00000
-3  8.00000     host purple
 7  2.00000         osd.7      down        0          1.00000
 6  2.00000         osd.6        up  1.00000          1.00000
 4  2.00000         osd.4        up  1.00000          1.00000
 5  2.00000         osd.5        up  1.00000          1.00000
-4  8.00000     host orange
11  2.00000         osd.11     down        0          1.00000
10  2.00000         osd.10     down        0          1.00000
 8  2.00000         osd.8      down        0          1.00000
 9  2.00000         osd.9      down        0          1.00000

 

 

 

 

 

 

root@black:/var/log/ceph# ceph -s

2015-11-15 21:55:27.919339 7ffb38446700  0 -- :/1336310814 >> 172.16.0.203:6789/0 pipe(0x7ffb34064550 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7ffb3405e000).fault

    cluster ee8eae7a-5994-48bc-bd43-aa07639a543b

     health HEALTH_WARN

            1591 pgs backfill

            38 pgs backfilling

            2439 pgs degraded

            105 pgs down

            106 pgs peering

            138 pgs stale

            2439 pgs stuck degraded

            106 pgs stuck inactive

            138 pgs stuck stale

            2873 pgs stuck unclean

            2439 pgs stuck undersized

            2439 pgs undersized

            recovery 1694156/6668499 objects degraded (25.405%)

            recovery 2315800/6668499 objects misplaced (34.727%)

            too many PGs per OSD (1197 > max 350)

            1 mons down, quorum 0,1 black,purple

     monmap e3: 3 mons at {black=172.16.0.201:6789/0,orange=172.16.0.203:6789/0,purple=172.16.0.202:6789/0}

            election epoch 448, quorum 0,1 black,purple

     mdsmap e5: 0/0/1 up

     osdmap e34098: 12 osds: 7 up, 7 in; 2024 remapped pgs

      pgmap v8211622: 4608 pgs, 3 pools, 12027 GB data, 3029 kobjects

            17141 GB used, 8927 GB / 26069 GB avail

            1694156/6668499 objects degraded (25.405%)

            2315800/6668499 objects misplaced (34.727%)

                1735 active+clean

                1590 active+undersized+degraded+remapped+wait_backfill

                 637 active+undersized+degraded

                 326 active+remapped

                 137 stale+active+undersized+degraded

                 101 down+peering

                  38 active+undersized+degraded+remapped+backfilling

                  37 active+undersized+degraded+remapped

                   4 down+remapped+peering

                   1 stale+remapped+peering

                   1 active

                   1 active+remapped+wait_backfill

recovery io 66787 kB/s, 16 objects/s
