On Wed, Nov 26, 2014 at 00:26:26 CET, Sven Eschenberg wrote:
[...]
> > Really, if you are concerned about disks dying (and you should be),
> > do the sane thing and have SMART monitoring and regular long SMART
> > selftests (I do them every 14 days) via smartmond. If you are concerned
> > about the RAID being inconsistent (and you should be), do RAID
> > consistency checks (I do them every 7 days) and RAID monitoring via
> > mdadm.
>
> I agree, if I may ask, do you run the consistency check via cron or some
> other way?

Sure. I basically reverse-engineered the Debian script that does it and
wrote a Python script for cron usage on that basis.

The mechanics are as follows:

1. Make sure /sys/block/<device>/md/mismatch_cnt is zero.
2. Write "check" to /sys/block/<device>/md/sync_action.
3. Wait until /sys/block/<device>/md/sync_action no longer reads back
   as "check\n".
4. Check whether /sys/block/<device>/md/mismatch_cnt is non-zero.

Script attached below.

Regards,
Arno

----
#!/usr/bin/python3.1
# (c) 2010 Arno Wagner arno@xxxxxxxxxxx
# Distributed under the GPLv2, see http://www.gnu.org/licenses/gpl-2.0.html
# Aim: Runs a consistency check on the md device given
#      and checks mismatch count afterwards.
# Result: Silent on no error, output to stdout in case of error.
# Intended for cron-job usage.
import re, sys, time

device = 'md0'

# check device exists and is "active"
flag = False
with open('/proc/mdstat', 'r') as f:
    for l in f:
        l = l.rstrip('\n')
        if re.search(r'^' + device + r'\s:\s', l):
            flag = True
            break
if not flag:
    print('ERROR: Specified device ' + device + ' not found in /proc/mdstat')
    sys.exit(1)
if not re.search(r'^' + device + r'\s:\sactive\s', l):
    print('ERROR: Device ' + device + ' not active:')
    print(' ' + l)
    sys.exit(1)

# assemble paths for check
action_p = '/sys/block/' + device + '/md/sync_action'
mismatch_p = '/sys/block/' + device + '/md/mismatch_cnt'

# make sure no mismatches are present
with open(mismatch_p, 'r') as f:
    mism = int(f.readline())
if mism != 0:
    print('ERROR: Device ' + device + ' has mismatches before check: ' + repr(mism))
    sys.exit(1)

# start the check (the file is closed, and the write flushed, on leaving "with")
with open(action_p, 'w') as f:
    f.write('check')

# wait for the check to complete, polling once per second
while True:
    with open(action_p, 'r') as f:
        l = f.readline()
    time.sleep(1)
    if l != 'check\n':
        break

# check mismatches
with open(mismatch_p, 'r') as f:
    mism = int(f.readline())
if mism != 0:
    print('ERROR: Device ' + device + ' has mismatches: ' + repr(mism))
    sys.exit(1)
else:
    pass
    # print(' Device '+device+' has no mismatches')

-- 
Arno Wagner, Dr. sc. techn., Dipl. Inform., Email: arno@xxxxxxxxxxx
GnuPG: ID: CB5D9718 FP: 12D6 C03B 1B30 33BB 13CF B774 E35C 5FA1 CB5D 9718
----
A good decision is based on knowledge and not on numbers. -- Plato

If it's in the news, don't worry about it. The very definition of
"news" is "something that hardly ever happens." -- Bruce Schneier
_______________________________________________
dm-crypt mailing list
dm-crypt@xxxxxxxx
http://www.saout.de/mailman/listinfo/dm-crypt