Thomas Byrne - UKRI STFC wrote:
: I recently spent some time looking at this, I believe the 'summary' and
: 'overall_status' sections are now deprecated. The 'status' and 'checks'
: fields are the ones to use now.

OK, thanks.

: The 'status' field gives you the OK/WARN/ERR, but returning the most
: severe error condition from the 'checks' section is less trivial. AFAIK
: all health_warn states are treated as equally severe, and same for
: health_err. We ended up formatting our single line human readable output
: as something like:
:
: "HEALTH_ERR: 1 inconsistent pg, HEALTH_ERR: 1 scrub error, HEALTH_WARN: 20 large omap objects"

Speaking of scrub errors:

In previous versions of Ceph, I was able to determine which PGs had
scrub errors, and then a cron.hourly script ran "ceph pg repair" for them,
provided that they were not already being scrubbed. In Luminous, the bad PG
is not visible in "ceph --status" anywhere. Should I use something like
"ceph health detail -f json-pretty" instead?

Also, is it possible to configure Ceph to attempt repairing
the bad PGs itself, as soon as the scrub fails? I run most of my OSDs on top
of a bunch of old spinning disks, and a scrub error almost always means
that there is a bad sector somewhere, which can easily be fixed by
rewriting the lost data using "ceph pg repair".

Thanks,

-Yenya

--
| Jan "Yenya" Kasprzak <kas at {fi.muni.cz - work | yenya.net - private}> |
| http://www.fi.muni.cz/~kas/                         GPG: 4096R/A45477D5 |
 This is the world we live in: the way to deal with computers is to google
 the symptoms, and hope that you don't have to watch a video. --P. Zaitcev
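
For illustration, here is a minimal sketch of how the 'checks' section could be turned into both the one-line summary quoted above and a list of PGs to hand to "ceph pg repair". It is not a tested production script. It assumes the JSON printed by "ceph health detail -f json" carries a 'checks' map whose entries have 'severity', 'summary.message' and a 'detail' list with messages of the form "pg <id> is <state>, acting [...]". The check names, the sample data and the helper names (repairable_pgs, one_line_summary) are assumptions made for the example; verify them against the output of your own cluster.

```python
import json
import re

# Hypothetical sample, shaped like what Luminous is assumed to print for
# "ceph health detail -f json"; real output may differ in detail.
SAMPLE = """
{
  "status": "HEALTH_ERR",
  "checks": {
    "LARGE_OMAP_OBJECTS": {
      "severity": "HEALTH_WARN",
      "summary": {"message": "20 large omap objects"},
      "detail": []
    },
    "OSD_SCRUB_ERRORS": {
      "severity": "HEALTH_ERR",
      "summary": {"message": "2 scrub errors"},
      "detail": []
    },
    "PG_DAMAGED": {
      "severity": "HEALTH_ERR",
      "summary": {"message": "Possible data damage: 2 pgs inconsistent"},
      "detail": [
        {"message": "pg 2.5 is active+clean+inconsistent, acting [1,0,3]"},
        {"message": "pg 2.1f is active+clean+scrubbing+deep+inconsistent, acting [4,2,0]"}
      ]
    }
  }
}
"""

# Matches the assumed detail line format: "pg <pool>.<hex> is <state>..."
PG_LINE = re.compile(r"^pg (\d+\.[0-9a-f]+) is (\S+)")


def repairable_pgs(health_json):
    """Return IDs of inconsistent PGs that are not being scrubbed or repaired."""
    checks = json.loads(health_json).get("checks", {})
    pgs = []
    for entry in checks.get("PG_DAMAGED", {}).get("detail", []):
        match = PG_LINE.match(entry.get("message", ""))
        if not match:
            continue
        pgid = match.group(1)
        flags = match.group(2).rstrip(",").split("+")
        if ("inconsistent" in flags
                and "scrubbing" not in flags
                and "repair" not in flags):
            pgs.append(pgid)
    return pgs


def one_line_summary(health_json):
    """Join all checks into one line, HEALTH_ERR entries before HEALTH_WARN."""
    checks = json.loads(health_json).get("checks", {})
    rank = {"HEALTH_ERR": 0, "HEALTH_WARN": 1}
    # sorted() is stable, so checks of equal severity keep their input order.
    ordered = sorted(checks.values(),
                     key=lambda c: rank.get(c.get("severity"), 2))
    return ", ".join(
        "{}: {}".format(c.get("severity"), c.get("summary", {}).get("message", ""))
        for c in ordered
    )


if __name__ == "__main__":
    print(one_line_summary(SAMPLE))
    print(repairable_pgs(SAMPLE))
```

In a cron.hourly job, the sample string would be replaced by the captured output of "ceph health detail -f json", and each returned ID would be passed to "ceph pg repair". Skipping PGs whose state includes "scrubbing" or "repair" mirrors the "not already being scrubbed" condition described above.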