On Fri, Jul 28, 2017 at 9:42 PM, Peter Maloney
<peter.maloney@xxxxxxxxxxxxxxxxxxxx> wrote:
> Hello Dan,
>
> Based on what I know and what people told me on IRC, this means basically
> the condition that the OSD is neither up nor acting for any PG. One person
> (fusl on IRC) said there was an unfound-objects bug when he had size = 1;
> he also said that if reweight (and I assume crush weight) is 0 it will
> surely be safe, but possibly it won't be otherwise.
>
> So here I took my bc-ceph-reweight-by-utilization.py script, which already
> parses `ceph pg dump --format=json` (for up, acting, bytes, and PG counts)
> and `ceph osd df --format=json` (for weight and reweight), gutted the
> unneeded parts, and changed the report to show the condition I described
> as True or False per OSD. The ceph auth therefore needs to allow `ceph pg
> dump` and `ceph osd df`. The script is attached.
>
> The script doesn't assume you're OK with acting lower than size, or that
> you care about min_size; it just assumes you want the OSD completely
> empty.

Thanks for this script. In fact, I am trying to use the min_size/size-based
removal heuristics. If we were able to wait until an OSD is completely
empty, then I suppose I could just set the crush weight to 0 and wait for
HEALTH_OK. For our procedures I'm trying to shortcut this with an earlier
device removal.

Cheers, Dan

> Sample output:
>
> Real cluster:
>
> root@cephtest:~ # ./bc-ceph-empty-osds.py -a
> osd_id  weight reweight pgs_old     bytes_old pgs_new     bytes_new empty
>      0 4.00099  0.61998      38 1221853911536      38 1221853911536 False
>      1 4.00099  0.59834      43 1168531341347      43 1168531341347 False
>      2 4.00099  0.79213      44 1155260814435      44 1155260814435 False
>     27 4.00099  0.69459      39 1210145117377      39 1210145117377 False
>     30 6.00099  0.73933      56 1691992924542      56 1691992924542 False
>     31 6.00099  0.81180      64 1810503842054      64 1810503842054 False
> ...
>
> Test cluster with some -nan and 0's in the crush map:
>
> root@tceph1:~ # ceph osd df
> ID WEIGHT  REWEIGHT SIZE   USE    AVAIL  %USE VAR  PGS
>  4 1.00000        0      0      0      0 -nan -nan   0
>  1 0.06439  1.00000 61409M 98860k 61313M 0.16 0.93  47
>  0 0.06438  1.00000 61409M   134M 61275M 0.22 1.29  59
>  2 0.06439  1.00000 61409M 82300k 61329M 0.13 0.77  46
>  3       0        0      0      0      0 -nan -nan   0
>               TOTAL 179G   311M   179G   0.17
> MIN/MAX VAR: 0.77/1.29  STDDEV: 0.04
>
> root@tceph1:~ # ./bc-ceph-empty-osds.py
> osd_id  weight reweight pgs_old bytes_old pgs_new bytes_new empty
>      3 0.00000  0.00000       0         0       0         0 True
>      4 1.00000  0.00000       0         0       0         0 True
> root@tceph1:~ # ./bc-ceph-empty-osds.py -a
> osd_id  weight reweight pgs_old bytes_old pgs_new bytes_new empty
>      0 0.06438  1.00000      59  46006167      59  46006167 False
>      1 0.06439  1.00000      47  28792306      47  28792306 False
>      2 0.06439  1.00000      46  17623485      46  17623485 False
>      3 0.00000  0.00000       0         0       0         0 True
>      4 1.00000  0.00000       0         0       0         0 True
>
> The "old" vs. "new" suffixes refer to where the data sits now and where it
> will sit after recovery completes, respectively. (That is the magic that
> made my reweight script efficient compared to the official reweight
> script.)
>
> I have not used such a method in the past myself... my cluster is small,
> so I have always just let recovery finish completely instead. I hope you
> find it useful and that it develops from there.
>
> Peter
>
>
> On 07/28/17 15:36, Dan van der Ster wrote:
>
> Hi all,
>
> We are trying to outsource the disk replacement process for our ceph
> clusters to some non-expert sysadmins.
> We could really use a tool that reports whether a Ceph OSD *would* or
> *would not* be safe to stop, e.g.
>
> # ceph-osd-safe-to-stop osd.X
> Yes, it would be OK to stop osd.X
>
> (which of course means that no PGs would go inactive if osd.X were to
> be stopped).
>
> Does anyone have such a script that they'd like to share?
>
> Thanks!
> Dan
> _______________________________________________
> ceph-users mailing list
> ceph-users@xxxxxxxxxxxxxx
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
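For reference, both conditions discussed in the thread can be sketched
against the pg_stats array that `ceph pg dump --format=json` returns. The
sketch below is not Peter's attached script; the key names ("up",
"acting") follow that JSON layout but vary between Ceph releases, and the
single pool-wide min_size parameter is a simplifying assumption, so treat
this as an illustration rather than a working tool:

```python
#!/usr/bin/env python3
"""Illustrative checks for the two conditions in the thread:
 - Peter's emptiness condition: the OSD is neither up nor acting
   for any PG, so removing it moves no data.
 - Dan's safe-to-stop condition: no PG would go inactive, i.e.
   every PG the OSD serves keeps at least min_size acting members."""

import json


def osd_is_empty(pg_stats, osd_id):
    """True if osd_id appears in neither 'up' nor 'acting' of any PG."""
    return not any(osd_id in pg["up"] or osd_id in pg["acting"]
                   for pg in pg_stats)


def safe_to_stop(pg_stats, osd_id, min_size):
    """True if stopping osd_id leaves every PG it serves with at
    least min_size acting replicas (assumes one pool-wide min_size)."""
    for pg in pg_stats:
        remaining = [o for o in pg["acting"] if o != osd_id]
        if osd_id in pg["acting"] and len(remaining) < min_size:
            return False
    return True


# Hypothetical data in the shape of the pg_stats array:
pg_stats = json.loads("""[
    {"pgid": "1.0", "up": [0, 1], "acting": [0, 1]},
    {"pgid": "1.1", "up": [1, 2], "acting": [1, 2]}
]""")

print(osd_is_empty(pg_stats, 3))     # osd.3 holds no PGs -> True
print(safe_to_stop(pg_stats, 1, 1))  # each PG keeps one replica -> True
print(safe_to_stop(pg_stats, 1, 2))  # PGs would drop below min_size -> False
```

The emptiness check is the stricter of the two: it only passes once
recovery has fully drained the OSD, whereas the safe-to-stop check passes
as soon as stopping the OSD would leave all of its PGs active, which is
the earlier shortcut Dan describes.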