Re: [ceph-users] Running on disks that lose their head

On Wed, 6 Nov 2013, Loic Dachary wrote:
> Hi Ceph,
> 
> People from Western Digital suggested ways to better take advantage of 
> the disk error reporting. They gave two examples that struck my 
> imagination. First there are errors that look like the disk is dying ( 
> read / write failures ) but it's only a transient problem and the driver 
> should be able to make the difference by properly interpreting the 
> available information. They said that the prolonged life you get if you 
> don't decommission a disk that only has a transient error is 

This makes me think we really need to build or integrate with some generic 
SMART reporting infrastructure so that we can identify disks that are 
failing or going to fail.  What to do with that information is another 
question; initially I would lean toward just marking the disk out, but 
there may be smarter alternatives to investigate.
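
As a rough illustration of the "just mark the disk out" starting point, here 
is a minimal Python sketch. It assumes the attribute table printed by 
`smartctl -A` for ATA disks has already been captured as text. The helper 
names, the choice of attributes, and the zero thresholds are all hypothetical, 
and the sketch only suggests a `ceph osd out` command rather than running it.

```python
# Hypothetical policy: attribute names follow smartctl's ATA attribute table;
# the limits below are illustrative assumptions, not vendor guidance.
SUSPECT_RAW_LIMITS = {
    "Reallocated_Sector_Ct": 0,
    "Current_Pending_Sector": 0,
    "Offline_Uncorrectable": 0,
}


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A`-style output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header row starts with "ID#" and is skipped by the isdigit test.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # raw value is not a plain integer; ignore this row
    return attrs


def disk_is_suspect(attrs, limits=SUSPECT_RAW_LIMITS):
    """True if any watched attribute's raw value exceeds its limit."""
    return any(attrs.get(name, 0) > limit for name, limit in limits.items())


def suggested_action(osd_id, attrs):
    """Return the command an operator might run, or None if the disk looks fine."""
    if disk_is_suspect(attrs):
        return "ceph osd out %d" % osd_id
    return None
```

A smarter policy (for example, telling transient errors apart from a dying 
disk) would replace `disk_is_suspect`; the parsing step would stay the same.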

> significant. The second example is when one head out of ten fails : 
> disks can keep working with the nine remaining heads. Losing 1/10 of the 
> disk is likely to result in a full re-install of the Ceph osd. But, 
> again, the disk could keep going after that, with 9/10 of its original 
> capacity. And Ceph is good at handling osd failures.

Yeah...but if you lose 1/10 of a block device any existing local file 
system is going to blow up.  I suspect this is something that newfangled 
interfaces like Kinetic will be much better at.  Even then, though, it is 
challenging for anything sitting above to cope with losing some random 
subset of its data underneath.  To a first approximation, for this to be 
useful, the fs and disk would need to keep, say, all the data in a 
particular PG confined to a single platter, so that when a head goes the 
other PGs are still fully intact and usable.  It is probably a long way to 
get from here to there...
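
To make the confinement idea concrete, here is a toy Python model. It is 
only a sketch: no such placement interface exists in Ceph or in any disk 
today, the function names are made up, and it simplifies "platter" down to 
"the surface served by one head".

```python
def place_pgs(pg_ids, num_heads):
    """Confine each PG to exactly one head, round-robin over sorted PG ids.

    Hypothetical placement: a real design would also need to balance space.
    """
    return {pg: i % num_heads for i, pg in enumerate(sorted(pg_ids))}


def surviving_pgs(placement, failed_head):
    """PGs that remain fully intact after one head fails."""
    return sorted(pg for pg, head in placement.items() if head != failed_head)


def lost_pgs(placement, failed_head):
    """PGs that lived on the failed head and must be recovered elsewhere."""
    return sorted(pg for pg, head in placement.items() if head == failed_head)
```

With 20 PGs spread over 10 heads this way, losing one head costs 2 PGs and 
leaves the other 18 untouched. If instead each PG's data were spread across 
the whole LBA range, every PG would be damaged by the same failure.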

> All this is news to me and sounds really cool. But I'm sure there are 
> people who already know about it and I'm eager to hear their opinion :-)
> 
> Cheers
> 
> -- 
> Loïc Dachary, Artisan Logiciel Libre
> 
> 



