Ceph Nautilus: device health management, no info in ceph device ls

Hello,

I run a Ceph Nautilus 14.2.22 cluster with 144 OSDs. To detect disks that have hardware trouble and might fail soon, I activated device health management. The cluster runs on Ubuntu 18.04, and the first task was to install a newer smartctl version. I used smartctl 7.0.

Device monitoring is activated (ceph device monitoring on). Using ceph device get-health-metrics <device ID>, I see the results of the smartctl runs for the device with the given ID, like this:

....
        "product": "ST4000NM0295",
        "revision": "DT31",
        "rotation_rate": 7200,
        "scsi_error_counter_log": {
            "read": {
                "correction_algorithm_invocations": 20,
                "errors_corrected_by_eccdelayed": 20,
                "errors_corrected_by_eccfast": 3457558131,
....

So this seems to run just fine. For failure prediction I selected the "local" method (ceph config set global device_failure_prediction_mode local).
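
To check the collected metrics across all smartctl runs rather than by eye, the JSON can be reduced to the counters of interest. This is a minimal sketch, assuming the output of ceph device get-health-metrics is a JSON object keyed by a timestamp string with one smartctl report per key. The timestamp key and the helper name read_error_counters are made up for illustration; the field names are the ones from the excerpt above.

```python
def read_error_counters(metrics):
    """Return the SCSI read error counters per smartctl run.

    `metrics` is assumed to be the parsed JSON from
    `ceph device get-health-metrics <device ID>`: a dict that maps a
    timestamp string to one smartctl report (assumption, see lead-in).
    """
    counters = {}
    for stamp, report in metrics.items():
        # Missing sections yield None instead of raising KeyError.
        read = report.get("scsi_error_counter_log", {}).get("read", {})
        counters[stamp] = {
            "delayed_ecc": read.get("errors_corrected_by_eccdelayed"),
            "fast_ecc": read.get("errors_corrected_by_eccfast"),
        }
    return counters


# Sample built from the excerpt above; the timestamp key is hypothetical.
sample = {
    "20220101-000000": {
        "product": "ST4000NM0295",
        "scsi_error_counter_log": {
            "read": {
                "correction_algorithm_invocations": 20,
                "errors_corrected_by_eccdelayed": 20,
                "errors_corrected_by_eccfast": 3457558131,
            }
        },
    }
}

print(read_error_counters(sample))
```

With real data, the dict would come from json.loads() applied to the command output instead of the inline sample.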

What is missing for me is the prediction output in ceph device ls. The "LIFE EXPECTANCY" column is always empty and I have no idea why:

# ceph device ls
DEVICE                            HOST:DEV  DAEMONS LIFE EXPECTANCY
SEAGATE_ST4000NM017A_WS23WKJ4     ceph4:sdb osd.49
SEAGATE_ST4000NM0295_ZC13XK9P     ceph6:sdo osd.92
SEAGATE_ST4000NM0295_ZC141B3S     ceph6:sdj osd.89
....
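
With 144 OSDs it is easier to check the column by script than by eye. This is a minimal sketch, assuming the plain-text table is aligned in fixed-width columns as in the listing above, so that everything at or after the position of the "LIFE EXPECTANCY" header belongs to that column. The function name is made up for illustration.

```python
def rows_without_life_expectancy(table_text):
    """Return the device IDs whose LIFE EXPECTANCY column is empty.

    Assumes fixed-width column alignment as in the plain-text output
    of `ceph device ls` shown above (assumption, see lead-in).
    """
    lines = [ln for ln in table_text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # Start offset of the last column, taken from the header line.
    col = header.index("LIFE EXPECTANCY")
    missing = []
    for row in rows:
        # Rows shorter than `col` give an empty slice, i.e. no value.
        if not row[col:].strip():
            missing.append(row.split()[0])
    return missing


# Sample copied from the listing above.
sample = """\
DEVICE                            HOST:DEV  DAEMONS LIFE EXPECTANCY
SEAGATE_ST4000NM017A_WS23WKJ4     ceph4:sdb osd.49
SEAGATE_ST4000NM0295_ZC13XK9P     ceph6:sdo osd.92
SEAGATE_ST4000NM0295_ZC141B3S     ceph6:sdj osd.89
"""

print(rows_without_life_expectancy(sample))
```

On my cluster this would list every device, since none of them has a value in that column.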

Does anyone have an idea what might be missing in my setup? Is "LIFE EXPECTANCY" perhaps only populated if the local predictor predicts a failure, or should I find something like "good" there if the disk is OK for the moment? Recently I even had a disk that died, but I did not see anything in ceph device ls for the failed OSD disk. So I am really unsure whether failure prediction is working at all on my Ceph system.

Thanks
Rainer

--
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse 1
56070 Koblenz, Tel: +49261287 1312 Fax +49261287 100 1312
Web: http://userpages.uni-koblenz.de/~krienke
PGP: http://userpages.uni-koblenz.de/~krienke/mypgp.html