Running the following script on an unimportant tree in the cluster this
weekend as a test. So far, so good, and it appears to be doing what I want.
Thanks again Pete for the recommendation.

-Dan

#!/bin/bash

# Force a fresh copy of every file under the mount: rename the original to a
# unique temporary name, copy it back into place preserving attributes, then
# remove the temporary copy.

BR='-------------------------'

UUID=$(/usr/bin/uuidgen)
if [ "$UUID" == "" ]
then
        echo "UUID is null"
        exit 1
fi

find "/mnt/blah/" -type f | while IFS= read -r FILE
do
        DNAME=$(dirname "${FILE}")
        FNAME=$(basename "${FILE}")

        cd "${DNAME}"
        if (( $? > 0 ))
        then
                echo "Bad cd operation"
                exit 1
        fi

        pwd

        # Rename the original out of the way...
        mv -v "${FNAME}" "${FNAME}.${UUID}"
        if (( $? > 0 ))
        then
                echo "Bad mv operation"
                exit 1
        fi

        # ...copy it back, preserving mode/ownership/timestamps...
        cp -pv "${FNAME}.${UUID}" "${FNAME}"
        if (( $? > 0 ))
        then
                echo "Bad cp operation"
                exit 1
        fi

        # ...and remove the temporary copy.
        rm -fv "${FNAME}.${UUID}"
        if (( $? > 0 ))
        then
                echo "Bad rm operation"
                exit 1
        fi

        echo "${BR}"
done

On 11 April 2013 22:41, Daniel Mons <daemons at kanuka.com.au> wrote:
> Hi Pete,
>
> Thanks for that link. I'm going to try this en masse on an unimportant
> directory over the weekend.
>
> -Dan
>
>
> On 11 April 2013 01:41, Pete Smith <pete at realisestudio.com> wrote:
>> Hi Dan
>>
>> I've come up against this recently whilst trying to delete large amounts
>> of files from our cluster.
>>
>> I'm resolving it with the method from
>> http://comments.gmane.org/gmane.comp.file-systems.gluster.user/1917
>>
>> With Fabric as a helping hand, it's not too tedious.
>>
>> Not sure about the level of glustershd compatibility, but it's working
>> for me.
>>
>> HTH
>>
>> Pete
>> --
>>
>>
>> On 10 April 2013 11:44, Daniel Mons <daemons at kanuka.com.au> wrote:
>>>
>>> Our production GlusterFS 3.3.1GA setup is a 3x2 distribute-replicate,
>>> with 100TB usable for staff. This is one of 4 identical GlusterFS
>>> clusters we're running.
>>>
>>> Very early in the life of our production Gluster rollout, we ran
>>> Netatalk 2.X to share files with MacOSX clients (due to slow negative
>>> lookup on CIFS/Samba for those pesky resource fork files in MacOSX's
>>> Finder). Netatalk 2.X wrote its CNID_DB files back to Gluster, which
>>> caused enormous IO, locking up many nodes at a time (lots of "hung
>>> task" errors in dmesg/syslog).
>>>
>>> We've since moved to Netatalk 3.X, which puts its CNID_DB files
>>> elsewhere (we put them on local SSD RAID), and the lockups have
>>> vanished. However, our split-brain files number in the tens of
>>> thousands due to those previous lockups, and they aren't always
>>> predictable (i.e. it's not always the case that brick0 is "good" and
>>> brick1 is "bad"). Manually fixing the files is far too time consuming.
>>>
>>> I've written a rudimentary script that trawls
>>> /var/log/glusterfs/glustershd.log for split-brain GFIDs, tracks each
>>> one down on the matching pair of bricks, and decides via a few rules
>>> (size tends to be a good indicator for us, as bigger files tend to be
>>> more recent ones) which is the "good" file. This works for about 80%
>>> of files, which will dramatically reduce the amount of data we have to
>>> check manually.
>>>
>>> My question is: what should I do from here? Options are:
>>>
>>> Option 1) Delete the file from the "bad" brick
>>>
>>> Option 2) rsync the file from the "good" brick to the "bad" brick
>>> with the -aX flags (preserve everything, including trusted.afr.$server
>>> and trusted.gfid xattrs)
>>>
>>> Option 3) rsync the file from "good" to "bad", and then setfattr -x
>>> trusted.* on the bad brick.
>>>
>>> Which of these is considered the better (more glustershd-compatible)
>>> option? Or alternatively, is there something else that's preferred?
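For reference, a rough sketch of what option 3 could look like when run on
the "bad" brick's server. The remote host "goodserver", brick path
"/data/brick0" and volume name "gv0" are placeholders, and the exact
trusted.afr.* attribute names depend on how the volume is defined, so treat
this as illustrative rather than a tested recipe:

# Sketch only: overwrite the bad copy with the good one, then drop the AFR
# changelog and GFID xattrs on the bad brick so self-heal can regenerate them.
# "goodserver", "/data/brick0" and volume name "gv0" are placeholders.
FILE="some/path/relative/to/the/brick"
rsync -aX "goodserver:/data/brick0/${FILE}" "/data/brick0/${FILE}"
# setfattr does not expand trusted.*, so each attribute is removed by name.
setfattr -x trusted.afr.gv0-client-0 "/data/brick0/${FILE}"
setfattr -x trusted.afr.gv0-client-1 "/data/brick0/${FILE}"
setfattr -x trusted.gfid "/data/brick0/${FILE}"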
>>>
>>> Normally I'd just test this on our backup gluster, however as it was
>>> never running Netatalk, it has no split-brain problems, so I can't
>>> test the functionality.
>>>
>>> Thanks for any insight provided,
>>>
>>> -Dan
>>> _______________________________________________
>>> Gluster-users mailing list
>>> Gluster-users at gluster.org
>>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
>>
>>
>>
>>
>> --
>> Pete Smith
>> DevOp/System Administrator
>> Realise Studio
>> 12/13 Poland Street, London W1F 8QB
>> T. +44 (0)20 7165 9644
>>
>> realisestudio.com
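On the detection side, a rough sketch of the GFID-trawling step Dan
describes — pulling split-brain GFIDs out of glustershd.log and resolving
each one to its path on a brick via the .glusterfs hard-link tree. The brick
path is a placeholder and the exact log wording varies between 3.3.x
releases, so this is a starting point rather than the actual script in use:

#!/bin/bash
# Sketch only: list GFIDs flagged as split-brain by the self-heal daemon and
# map each to its real path on this brick. BRICK is a placeholder.
BRICK="/data/brick0"

grep 'split-brain' /var/log/glusterfs/glustershd.log \
        | grep -o 'gfid:[0-9a-f-]\{36\}' \
        | sed 's/^gfid://' | sort -u \
        | while IFS= read -r GFID
do
        # Regular files are hard-linked under .glusterfs/aa/bb/<gfid>, so any
        # path on the brick sharing that inode is the file itself.
        LINK="${BRICK}/.glusterfs/${GFID:0:2}/${GFID:2:2}/${GFID}"
        [ -e "${LINK}" ] || continue
        find "${BRICK}" -path "${BRICK}/.glusterfs" -prune -o \
                -samefile "${LINK}" -print
done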