Re: Help build a drive reliability service!

On Wed, May 24, 2017 at 7:57 PM, Patrick McGarry <pmcgarry@xxxxxxxxxx> wrote:
> Hey cephers,
>
> Just wanted to share the genesis of a new community project that could
> use a few helping hands (and any amount of feedback/discussion that
> you might like to offer).
>
> As a bit of backstory, around 2013 the Backblaze folks started
> publishing statistics about hard drive reliability from within their
> data center for the world to consume. This included things like model,
> make, failure state, and SMART data. If you would like to view the
> Backblaze data set, you can find it at:
>
> https://www.backblaze.com/b2/hard-drive-test-data.html
>
> While most major cloud providers are doing this for themselves
> internally, we would like to replicate/enhance this effort across a
> much wider segment of the population as a free service.  I think we
> have a pretty good handle on the server/platform side of things, and a
> couple of people who have expressed interest in building the
> reliability model (although we could always use more!). What we really
> need is a passionate volunteer who would like to come forward to write
> the agent that sits on the drives, aggregates data, and submits daily
> stats reports via an API (and potentially receives information back as
> results are calculated about MTTF or potential to fail in the next
> 24-48 hrs).
>
> Currently my thinking is to build our collection method based on the
> Backblaze data set so that we can use it to train our model and build
> on it going forward. If this sounds like a project you would like to be
> involved in (especially if you're from Backblaze!) please let me know.
> I think a first pass of the agent should be something we can build in
> a couple of afternoons to start testing with a small pilot group that
> we already have available.

I happen to have already written (some time ago) an agent that
collects SMART data and posts it to a web service.  It's written in Go
and links against a crudely hacked version of smartmontools to gather
the stats.

Any interest?  (hopefully I can find the code...)

John

>
> Happy to entertain any thoughts or feedback that people might have. Thanks!
>
> --
>
> Best Regards,
>
> Patrick McGarry
> Director Ceph Community || Red Hat
> http://ceph.com  ||  http://community.redhat.com
> @scuttlemonkey || @ceph
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html