Re: Disk Failure Prediction cloud module?

Hi Jake,

Many thanks for contributing the data.

Indeed, our data scientists use the data from Backblaze too.

Have you found strong correlations between device health metrics (such as
reallocated sector count, or any combination of attributes) and read/write
errors in /var/log/messages from what you experienced so far?

How long does it take from the moment you notice such errors until you
decide to remove the disk?

Thanks,
Yaarit


On Fri, Jan 21, 2022 at 7:14 AM Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:

> Hi Yaarit,
>
> Thanks for confirming.
>
> Telemetry is enabled on our clusters, so we are contributing data on ~1270
> disks.
>
> Are you able to use data from Backblaze?
>
> Deciding when an OSD is starting to fail is a dark art; we are still
> hoping that the Disk Failure Prediction module will take the guesswork
> out of this.
>
> We currently use smartctl to look for disks with outliers in
> Reallocated_Sector_Ct and then look for read or write errors in
> /var/log/messages.
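
A minimal sketch of the two-step check described above, in Python. Everything
here is an assumption for illustration, not the procedure actually used on
these clusters: the function names are hypothetical, the outlier rule
(median plus k times the median absolute deviation, with a floor) is one
arbitrary choice among many, and the kernel log pattern matches only the
common "I/O error, dev sdX" and "critical medium error, dev sdX" messages,
whose exact wording varies between kernel versions.

```python
import re
import statistics
import subprocess
from collections import Counter

# Assumed kernel message shapes; real logs may differ by kernel version.
IO_ERROR = re.compile(r"(?:I/O error|critical medium error), dev (\w+)")


def read_smart(dev):
    """Return `smartctl -A` output for a short device name such as 'sda'.

    Needs root and smartmontools. smartctl uses a bitmask exit status,
    so a non-zero return code is not treated as a failure here.
    """
    result = subprocess.run(
        ["smartctl", "-A", f"/dev/{dev}"], capture_output=True, text=True
    )
    return result.stdout


def reallocated_sectors(smart_text):
    """Extract the raw Reallocated_Sector_Ct value, or None if absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        # ATA attribute rows have 10 columns; RAW_VALUE is the last one.
        if len(fields) >= 10 and fields[1] == "Reallocated_Sector_Ct":
            return int(fields[9])
    return None


def find_outliers(counts, k=3.0, floor=10):
    """Return devices whose count exceeds max(median + k * MAD, floor)."""
    if not counts:
        return []
    values = list(counts.values())
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    cutoff = max(med + k * mad, floor)
    return sorted(dev for dev, v in counts.items() if v > cutoff)


def count_io_errors(log_lines):
    """Count matching kernel error lines per device name."""
    hits = Counter()
    for line in log_lines:
        match = IO_ERROR.search(line)
        if match:
            hits[match.group(1)] += 1
    return hits


def suspects(counts, log_lines):
    """Outlier devices that also show read/write errors in the log."""
    errors = count_io_errors(log_lines)
    return {dev: errors[dev] for dev in find_outliers(counts) if errors[dev] > 0}
```

In use, one would build `counts` by calling `reallocated_sectors(read_smart(dev))`
for each disk and pass the lines of /var/log/messages as `log_lines`; the
thresholds `k` and `floor` would need tuning against real fleet data.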
>
> best regards,
>
> Jake
>
>
> On 1/20/22 16:43, Yaarit Hatuka wrote:
> > Hi Jake,
> >
> > diskprediction_cloud module is no longer available in Pacific.
> > There are efforts to enhance the diskprediction module, using our
> > anonymized device telemetry data, which is aimed at building a dynamic,
> > large, diverse, free and open data set to help data scientists create
> > accurate failure prediction models.
> >
> > See more details:
> > https://ceph.io/en/users/telemetry/device-telemetry/
> > https://docs.ceph.com/en/latest/mgr/telemetry/
> >
> > Please join these efforts by opting-in to telemetry with:
> > `ceph telemetry on`
> > or with the dashboard's wizard.
> > If for some reason you cannot or do not wish to opt in, please share the
> > reason with us.
> >
> > Thanks,
> > Yaarit
> >
> >
> > On Thu, Jan 20, 2022 at 6:39 AM Jake Grimmett <jog@xxxxxxxxxxxxxxxxx> wrote:
> >
> >     Dear All,
> >
> >     Is the cloud option for the diskprediction module deprecated in Pacific?
> >
> >     https://docs.ceph.com/en/pacific/mgr/diskprediction/
> >
> >     If so, are ProphetStor still contributing data to the local module, or
> >     is this being updated by someone using data from Backblaze?
> >
> >     Do people find this module useful?
> >
> >     many thanks
> >
> >     Jake
> >
> >     --
> >     Dr Jake Grimmett
> >     Head Of Scientific Computing
> >     MRC Laboratory of Molecular Biology
> >     Francis Crick Avenue,
> >     Cambridge CB2 0QH, UK.
> >
> >     _______________________________________________
> >     ceph-users mailing list -- ceph-users@xxxxxxx
> >     <mailto:ceph-users@xxxxxxx>
> >     To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >     <mailto:ceph-users-leave@xxxxxxx>
> >
>
>
> For help, read https://www.mrc-lmb.cam.ac.uk/scicomp/
> then contact unixadmin@xxxxxxxxxxxxxxxxx
> --
> Dr Jake Grimmett
> Head Of Scientific Computing
> MRC Laboratory of Molecular Biology
> Francis Crick Avenue,
> Cambridge CB2 0QH, UK.
> Phone 01223 267019
> Mobile 0776 9886539
>
>
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


