Re: Another osd is filled too full and taken off after manually taking one osd out

On Tue, Jun 18, 2013 at 08:13:39PM +0800, Da Chun wrote:
> Hi List,
> My ceph cluster has two osds on each node: one with 15g capacity, and the other with 10g.
> Interestingly, after I took the 15g osd out of the cluster, the cluster started to rebalance; the 10g osd on the same node eventually filled up, was taken off, and failed to start again with the following error in the osd log file:
> 2013-06-18 19:51:20.799756 7f6805ee07c0 -1 filestore(/var/lib/ceph/osd/ceph-1) Extended attributes don't appear to work. Got error (28) No space left on device. If you are using ext3 or ext4, be sure to mount the underlying file system with the 'user_xattr' option.
> 2013-06-18 19:51:20.800258 7f6805ee07c0 -1 ** ERROR: error converting store /var/lib/ceph/osd/ceph-1: (95) Operation not supported
> 
> 
> 
> I guess the 10g osd was chosen by the cluster to hold the extra objects.
> My questions:
> 1. How are the extra objects spread across the cluster after an osd is taken out? Are they spread to only one of the osds?
> 2. Is there no mechanism to prevent osds from being filled too full and taken off?
> 

As far as I understand it:

Each OSD has the same weight by default. You can give an OSD a lower weight so that CRUSH places less data on it.

The reason to do so could be that it has less capacity, or that it is slower.
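As a sketch, weights can be set with `ceph osd crush reweight` so that they roughly track each OSD's capacity (the osd ids and weight values below are illustrative, matching the 15g/10g setup described in this thread; these commands need a running cluster):

```shell
# By Ceph convention, crush weight ~= capacity in TB, but any
# consistent ratio works. Give the 10g osd ~2/3 the weight of
# the 15g osd so it receives proportionally less data.
ceph osd crush reweight osd.0 0.015   # the 15g osd
ceph osd crush reweight osd.1 0.010   # the 10g osd

# Verify the resulting weights in the CRUSH hierarchy:
ceph osd tree
```

With capacity-proportional weights, removing one OSD redistributes its data across the remaining OSDs in proportion to their free space, rather than overloading the smaller one.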

> 
> Thanks for your time!
> 
> 
> This is the ceph log:
> 2013-06-18 19:26:41.567607 mon.0 172.18.46.34:6789/0 1599 : [INF] pgmap v14182: 456 pgs: 453 active+clean, 3 active+remapped+backfilling; 16874 MB data, 40220 MB used, 36513 MB / 76733 MB avail; 379/9761 degraded (3.883%);  recovering 19 o/s, 77608KB/s
> 2013-06-18 19:26:42.649139 mon.0 172.18.46.34:6789/0 1600 : [INF] pgmap v14183: 456 pgs: 454 active+clean, 2 active+remapped+backfilling; 16874 MB data, 40222 MB used, 36511 MB / 76733 MB avail; 309/9745 degraded (3.171%);  recovering 41 o/s, 162MB/s
> 2013-06-18 19:26:46.566721 mon.0 172.18.46.34:6789/0 1601 : [INF] pgmap v14184: 456 pgs: 454 active+clean, 2 active+remapped+backfilling; 16874 MB data, 40222 MB used, 36511 MB / 76733 MB avail; 250/9745 degraded (2.565%);  recovering 25 o/s, 101450KB/s
> 2013-06-18 19:26:39.858833 osd.1 172.18.46.35:6801/10730 88 : [WRN] OSD near full (91%)
> 2013-06-18 19:26:48.548076 mon.0 172.18.46.34:6789/0 1602 : [INF] pgmap v14185: 456 pgs: 454 active+clean, 2 active+remapped+backfilling; 16874 MB data, 40222 MB used, 36511 MB / 76733 MB avail; 200/9745 degraded (2.052%);  recovering 18 o/s, 72359KB/s
> 2013-06-18 19:26:51.898811 mon.0 172.18.46.34:6789/0 1603 : [INF] pgmap v14186: 456 pgs: 454 active+clean, 2 active+remapped+backfilling; 16874 MB data, 40222 MB used, 36511 MB / 76733 MB avail; 155/9745 degraded (1.591%);  recovering 17 o/s, 71823KB/s
> 2013-06-18 19:26:53.947739 mon.0 172.18.46.34:6789/0 1604 : [INF] pgmap v14187: 456 pgs: 454 active+clean, 2 active+remapped+backfilling; 16874 MB data, 40222 MB used, 36511 MB / 76733 MB avail; 113/9745 degraded (1.160%);  recovering 16 o/s, 65041KB/s
> 2013-06-18 19:26:57.293713 mon.0 172.18.46.34:6789/0 1605 : [INF] pgmap v14188: 456 pgs: 454 active+clean, 2 active+remapped+backfilling; 16874 MB data, 40222 MB used, 36511 MB / 76733 MB avail; 103/9745 degraded (1.057%);  recovering 9 o/s, 37353KB/s
> 2013-06-18 19:27:03.861124 mon.0 172.18.46.34:6789/0 1606 : [INF] pgmap v14189: 456 pgs: 454 active+clean, 2 active+remapped+backfilling; 16874 MB data, 35598 MB used, 41134 MB / 76733 MB avail; 103/9745 degraded (1.057%);  recovering 1 o/s, 3532KB/s
> 2013-06-18 19:27:13.732263 mon.0 172.18.46.34:6789/0 1607 : [DBG] osd.1 172.18.46.35:6801/10730 reported failed by osd.0 172.18.46.34:6804/1506
> 2013-06-18 19:27:15.949395 mon.0 172.18.46.34:6789/0 1608 : [DBG] osd.1 172.18.46.35:6801/10730 reported failed by osd.3 172.18.46.34:6807/11743
> 2013-06-18 19:27:17.239206 mon.0 172.18.46.34:6789/0 1609 : [DBG] osd.1 172.18.46.35:6801/10730 reported failed by osd.5 172.18.46.36:6806/7436
> 2013-06-18 19:27:17.239404 mon.0 172.18.46.34:6789/0 1610 : [INF] osd.1 172.18.46.35:6801/10730 failed (3 reports from 3 peers after 2013-06-18 19:27:38.239157 >= grace 20.000000)
> 2013-06-18 19:27:17.306958 mon.0 172.18.46.34:6789/0 1611 : [INF] osdmap e647: 6 osds: 5 up, 5 in
> 2013-06-18 19:27:17.387311 mon.0 172.18.46.34:6789/0 1612 : [INF] pgmap v14190: 456 pgs: 335 active+clean, 119 stale+active+clean, 2 active+remapped+backfilling; 16874 MB data, 35598 MB used, 41134 MB / 76733 MB avail; 103/9745 degraded (1.057%)
> 2013-06-18 19:27:18.308209 mon.0 172.18.46.34:6789/0 1613 : [INF] osdmap e648: 6 osds: 5 up, 5 in
> 2013-06-18 19:27:18.316487 mon.0 172.18.46.34:6789/0 1614 : [INF] pgmap v14191: 456 pgs: 335 active+clean, 119 stale+active+clean, 2 active+remapped+backfilling; 16874 MB data, 35598 MB used, 41134 MB / 76733 MB avail; 103/9745 degraded (1.057%)
> 2013-06-18 19:27:22.676915 mon.0 172.18.46.34:6789/0 1615 : [INF] pgmap v14192: 456 pgs: 280 active+clean, 79 stale+active+clean, 1 active+remapped, 1 active+remapped+backfilling, 95 active+degraded; 16874 MB data, 35596 MB used, 41137 MB / 76733 MB avail; 318/9334 degraded (3.407%);  recovering 0 o/s, 762KB/s
> 2013-06-18 19:27:23.766125 mon.0 172.18.46.34:6789/0 1616 : [INF] pgmap v14193: 456 pgs: 162 active+clean, 2 active+remapped, 292 active+degraded; 16874 MB data, 35612 MB used, 41121 MB / 76733 MB avail; 15EB/s rd, 0op/s; 2031/8972 degraded (22.637%);  recovering 15E o/s, 15EB/s
> 2013-06-18 19:29:03.896056 mon.0 172.18.46.34:6789/0 1617 : [INF] pgmap v14194: 456 pgs: 162 active+clean, 2 active+remapped, 292 active+degraded; 16874 MB data, 35612 MB used, 41121 MB / 76733 MB avail; 15EB/s rd, 0op/s; 2031/8972 degraded (22.637%);  recovering 15E o/s, 15EB/s
> 2013-06-18 19:29:22.700301 mon.0 172.18.46.34:6789/0 1618 : [INF] pgmap v14195: 456 pgs: 162 active+clean, 2 active+remapped, 292 active+degraded; 16874 MB data, 35615 MB used, 41118 MB / 76733 MB avail; 2031/8972 degraded (22.637%)
> 2013-06-18 19:29:23.759014 mon.0 172.18.46.34:6789/0 1619 : [INF] pgmap v14196: 456 pgs: 162 active+clean, 2 active+remapped, 292 active+degraded; 16874 MB data, 35596 MB used, 41137 MB / 76733 MB avail; 2031/8972 degraded (22.637%)
> 2013-06-18 19:31:03.932470 mon.0 172.18.46.34:6789/0 1620 : [INF] pgmap v14197: 456 pgs: 162 active+clean, 2 active+remapped, 292 active+degraded; 16874 MB data, 35596 MB used, 41137 MB / 76733 MB avail; 2031/8972 degraded (22.637%)
> 2013-06-18 19:32:18.012211 mon.0 172.18.46.34:6789/0 1621 : [INF] osd.1 out (down for 300.715725)


_______________________________________________
ceph-users mailing list
ceph-users@xxxxxxxxxxxxxx
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
