On Wed, 13 Jul 2011 09:19:40 -0700, Tommi Virtanen wrote:
On Wed, Jul 13, 2011 at 03:15, Meng Zhao <mzhao@xxxxxxxxxxxx> wrote:
active+clean; 349 MB data, 1394 MB used, 408 MB / 2046 MB avail; 49/224 degraded (21.875%)
=> for some reason, osd2 failed during object replication
If you lose osds while in degraded mode, you very much can lose objects permanently. Degraded means the replication has not completed. It's like losing a second disk in a RAID5 before it has healed, though the scope of the loss is individual objects, not the whole filesystem.
Thanks for going through the long log.
Please note that the system was reported as clean (by ceph -w) before osd0 was shut down:
2011-07-13 15:01:17.355846 pg v1099: 602 pgs: 602 active+clean; 349
MB data, 1778 MB used, 920 MB / 3069 MB avail
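For anyone trying to reproduce this, the kind of pre-shutdown sanity check I mean is roughly the following (only a sketch; the command names are from the ceph CLI, and the exact output format will vary between versions):

    # confirm there are no degraded or recovering pgs before touching an osd
    ceph health     # should report the cluster as healthy
    ceph -s         # status summary: all pgs active+clean, nothing degraded
    # only after that, shut down osd0 by whatever mechanism you normally use

That matches what the ceph -w line above already shows: all 602 pgs active+clean.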
The degradation starts after the 5 min timeout, when osd0 is marked out (more on that timeout below):
2011-07-13 16:18:03.746935 pg v1104: 602 pgs: 233 active+clean, 369
active+clean+degraded; 349 MB data, 1795 MB used, 910 MB / 3069 MB
avail; 67/224 degraded (29.911%)
Halfway into the replication, osd2 is marked out as well.
But the osd2 log shows that it had already been in a bad state half an hour earlier; ceph did not surface that information or take any action until replication started and osd2 eventually crashed. In other words, by the time more than one osd has failed, it is already too late.
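As an aside, the 5 min timeout mentioned above appears to correspond to mon osd down out interval, which defaults to 300 seconds. If that is right, something like the following in ceph.conf should widen the window before a down osd is marked out and re-replication starts (just a sketch of the relevant stanza; please correct me if the option name or section is wrong):

    [global]
            ; wait 10 minutes instead of the default 5 before marking a down osd out
            mon osd down out interval = 600

A longer interval would not have prevented the osd2 crash by itself, but it would give an operator more time to notice the first failure before the recovery load hits the remaining osds.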
More fundamentally, it seems that a self-diagnosis mechanism is needed, so that each osd checks its own health periodically.
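In the meantime, even a crude external watchdog would help, e.g. something along these lines (purely a sketch: the HEALTH_OK prefix, the 60 s interval, and the ceph-watchdog tag are my assumptions and may not match this version's output):

    #!/bin/sh
    # poll cluster health once a minute and log any transition away from healthy
    while true; do
        status=$(ceph health 2>&1)
        case "$status" in
            HEALTH_OK*) ;;                      # healthy, stay quiet
            *) logger -t ceph-watchdog "cluster not healthy: $status" ;;
        esac
        sleep 60
    done

Of course this only sees what the monitors already report, so it would not have caught osd2 being internally unhealthy while still up; a real self-check inside the osd would still be the better fix.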