Here is the output from ceph osd dump -o -:

epoch 19
fsid c2ae7ab2-d1b2-a467-be6e-f9a0031840f5
created 2011-04-04 13:27:06.857950
modified 2011-04-08 14:11:05.899596
flags

pg_pool 0 'data' pg_pool(rep pg_size 2 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 1 'metadata' pg_pool(rep pg_size 2 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 2 'casdata' pg_pool(rep pg_size 2 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 1 owner 0)
pg_pool 3 'rbd' pg_pool(rep pg_size 2 crush_ruleset 3 object_hash rjenkins pg_num 64 pgp_num 64 lpg_num 2 lpgp_num 2 last_change 1 owner 0)

max_osd 6
osd0 up in weight 1 up_from 14 up_thru 0 down_at 13 last_clean_interval 10-12 10.6.1.92:6800/17641 10.6.1.92:6801/17641 10.6.1.92:6802/17641
osd1 up in weight 1 up_from 4 up_thru 6 down_at 0 last_clean_interval 0-0 10.6.1.93:6800/31106 10.6.1.93:6801/31106 10.6.1.93:6802/31106
osd2 up in weight 1 up_from 17 up_thru 0 down_at 16 last_clean_interval 15-16 10.6.1.94:6800/2740 10.6.1.94:6803/2740 10.6.1.94:6804/2740
osd3 up in weight 1 up_from 18 up_thru 0 down_at 0 last_clean_interval 0-0 10.6.1.95:6800/32038 10.6.1.95:6801/32038 10.6.1.95:6802/32038

I have not removed any OSDs from the cluster. I created the cluster with a single mds/mon and have been adding OSDs slowly.

Mark Nigh
Systems Architect
mnigh@xxxxxxxxxxxxxxx
(p) 314.392.6926

-----Original Message-----
From: Wido den Hollander [mailto:wido@xxxxxxxxx]
Sent: Friday, April 08, 2011 1:38 PM
To: Mark Nigh
Cc: ceph-devel@xxxxxxxxxxxxxxx
Subject: Re: PGs Degraded

Hi Mark,

On Fri, 2011-04-08 at 12:09 -0500, Mark Nigh wrote:
> I have recently built a ceph cluster with the following nodes:
>
> 2011-04-08 11:54:08.038841    pg v3661: 264 pgs: 264 active+clean+degraded; 9079 MB data, 9234 MB used, 811 GB / 820 GB avail; 2319/4638 degraded (50.000%)
> 2011-04-08 11:54:08.039492   mds e17: 2/2/2 up {0=up:active,1=up:active}
> 2011-04-08 11:54:08.039529   osd e18: 4 osds: 4 up, 4 in
> 2011-04-08 11:54:08.039592   log 2011-04-08 10:08:09.135994 mds0 10.6.1.90:6800/16761 4 : [INF] closing stale session client4142 10.6.1.62:0/667143763 after 304.524869
> 2011-04-08 11:54:08.039673   mon e1: 1 mons at {0=10.6.1.90:6789/0}
>

That seems odd; your "data" is only about 200 MB less than "used". What is the replication size for the "data" and "metadata" pools?

$ ceph osd dump -o -

(rep pg_size)

> I have a few files in the cluster (not much data) but have noticed from the beginning of the build (after the 2nd OSD) that some of my PGs are degraded.
>
> How do I fix this and is there a tool/command to assist in determining which PGs are degraded?

You can view your degraded PGs with:

$ ceph pg dump -o -

This will tell you the degraded PGs and which OSDs they are on.

Did you remove any OSDs from this cluster? It seems very odd that the cluster is in a 50% degraded state.

Wido

> Ceph -v is as follows:
>
> ceph version 0.26 (commit:9981ff90968398da43c63106694d661f5e3d07d5)
>
> I appreciate the help.
>
> Mark Nigh
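
A quick way to answer Wido's question about the replication size directly from the dump above is to filter the pool lines (a small sketch against the 0.26 output pasted in this message; the field to look at is "rep pg_size"):

$ ceph osd dump -o - | grep pg_pool

Every pool line in the dump shows "rep pg_size 2", i.e. two copies of each object, which lines up with Wido's observation that "used" is barely larger than "data".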
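
And to narrow Wido's "ceph pg dump -o -" suggestion down to just the affected placement groups, a simple filter should do (assuming the degraded PGs carry "degraded" in their state column, matching the "active+clean+degraded" state in the status output quoted above):

$ ceph pg dump -o - | grep degraded

As Wido mentions, the OSD columns of the matching lines then show which OSDs each degraded PG maps to.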