No problem. Glad to be of help.

On 14 April 2013 05:31, Daniel Mons <daemons at kanuka.com.au> wrote:
> Running the following script on an unimportant tree in the cluster
> this weekend as a test. So far, so good, and it appears to be doing
> what I want.
>
> Thanks again Pete for the recommendation.
>
> -Dan
>
>
> #!/bin/bash
> BR='-------------------------'
> UUID=$(/usr/bin/uuidgen)
> if [ "$UUID" == "" ]
> then
>     echo "UUID is null"
>     exit 1
> fi
> find "/mnt/blah/" -type f | while read FILE
> do
>     DNAME=$(dirname "${FILE}")
>     FNAME=$(basename "${FILE}")
>     cd "${DNAME}"
>     if (( $? > 0 ))
>     then
>         echo "Bad cd operation"
>         exit 1
>     fi
>     pwd
>     mv -v "${FNAME}" "${FNAME}.${UUID}"
>     if (( $? > 0 ))
>     then
>         echo "Bad mv operation"
>         exit 1
>     fi
>     cp -pv "${FNAME}.${UUID}" "${FNAME}"
>     if (( $? > 0 ))
>     then
>         echo "Bad cp operation"
>         exit 1
>     fi
>     rm -fv "${FNAME}.${UUID}"
>     if (( $? > 0 ))
>     then
>         echo "Bad rm operation"
>         exit 1
>     fi
>     echo "${BR}"
> done
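
The quoted script reads names with a bare "read" and does a "cd" inside a piped "while", so filenames containing backslashes or leading/trailing whitespace can be mangled, and an "exit 1" inside the loop only leaves the pipeline's subshell. A minimal hardened sketch of the same rewrite-in-place idea, assuming bash and GNU find, with the tree path as a placeholder (untested here):

#!/bin/bash
# Sketch only: same mv/cp/rm rewrite as the quoted script, but NUL-delimited
# so filenames with spaces, backslashes or newlines survive, and any failed
# step aborts the whole run via "set -e".
set -euo pipefail

TREE="/mnt/blah"                        # placeholder: tree to rewrite
UUID=$(/usr/bin/uuidgen)
[ -n "$UUID" ] || { echo "UUID is null" >&2; exit 1; }

while IFS= read -r -d '' FILE
do
    TMP="${FILE}.${UUID}"
    mv -v  -- "$FILE" "$TMP"            # move the original aside
    cp -pv -- "$TMP" "$FILE"            # write a fresh copy under the original name
    rm -fv -- "$TMP"                    # drop the moved-aside original
done < <(find "$TREE" -type f -print0)

Whether rewriting a file this way actually clears its split-brain state is still up to glustershd; the sketch only changes how safely the tree is walked.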
>
> On 11 April 2013 22:41, Daniel Mons <daemons at kanuka.com.au> wrote:
> > Hi Pete,
> >
> > Thanks for that link. I'm going to try this en masse on an unimportant
> > directory over the weekend.
> >
> > -Dan
> >
> >
> > On 11 April 2013 01:41, Pete Smith <pete at realisestudio.com> wrote:
> >> Hi Dan
> >>
> >> I've come up against this recently whilst trying to delete large
> >> amounts of files from our cluster.
> >>
> >> I'm resolving it with the method from
> >> http://comments.gmane.org/gmane.comp.file-systems.gluster.user/1917
> >>
> >> With Fabric as a helping hand, it's not too tedious.
> >>
> >> Not sure about the level of glustershd compatibility, but it's
> >> working for me.
> >>
> >> HTH
> >>
> >> Pete
> >> --
> >>
> >>
> >> On 10 April 2013 11:44, Daniel Mons <daemons at kanuka.com.au> wrote:
> >>>
> >>> Our production GlusterFS 3.3.1GA setup is a 3x2 distribute-replicate,
> >>> with 100TB usable for staff. This is one of 4 identical GlusterFS
> >>> clusters we're running.
> >>>
> >>> Very early in the life of our production Gluster rollout, we ran
> >>> Netatalk 2.X to share files with Mac OS X clients (due to slow negative
> >>> lookup on CIFS/Samba for those pesky resource fork files in Mac OS X's
> >>> Finder). Netatalk 2.X wrote its CNID_DB files back to Gluster, which
> >>> caused enormous IO, locking up many nodes at a time (lots of "hung
> >>> task" errors in dmesg/syslog).
> >>>
> >>> We've since moved to Netatalk 3.X, which puts its CNID_DB files
> >>> elsewhere (we put them on local SSD RAID), and the lockups have
> >>> vanished. However, our split-brain files number in the tens of
> >>> thousands due to those previous lockups, and aren't always predictable
> >>> (i.e. it's not always the case that brick0 is "good" and brick1 is
> >>> "bad"). Manually fixing the files is far too time consuming.
> >>>
> >>> I've written a rudimentary script that trawls
> >>> /var/log/glusterfs/glustershd.log for split-brain GFIDs, tracks each one
> >>> down on the matching pair of bricks, and figures out via a few rules
> >>> (size tends to be a good indicator for us, as bigger files tend to be
> >>> more recent ones) which is the "good" file. This works for about 80%
> >>> of files, which will dramatically reduce the amount of data we have to
> >>> manually check.
> >>>
> >>> My question is: what should I do from here? Options are:
> >>>
> >>> Option 1) Delete the file from the "bad" brick
> >>>
> >>> Option 2) rsync the file from the "good" brick to the "bad" brick
> >>> with the -aX flag (preserve everything, including trusted.afr.$server
> >>> and trusted.gfid xattrs)
> >>>
> >>> Option 3) rsync the file from "good" to "bad", and then setfattr -x
> >>> trusted.* on the bad brick.
> >>>
> >>> Which of these is considered the better (more glustershd compatible)
> >>> option? Or alternatively, is there something else that's preferred?
> >>>
> >>> Normally I'd just test this on our backup gluster, however as it was
> >>> never running Netatalk, it has no split-brain problems, so I can't
> >>> test the functionality.
> >>>
> >>> Thanks for any insight provided,
> >>>
> >>> -Dan
> >>> _______________________________________________
> >>> Gluster-users mailing list
> >>> Gluster-users at gluster.org
> >>> http://supercolony.gluster.org/mailman/listinfo/gluster-users
> >>
> >>
> >> --
> >> Pete Smith
> >> DevOp/System Administrator
> >> Realise Studio
> >> 12/13 Poland Street, London W1F 8QB
> >> T. +44 (0)20 7165 9644
> >>
> >> realisestudio.com

--
Pete Smith
DevOp/System Administrator
Realise Studio
12/13 Poland Street, London W1F 8QB
T. +44 (0)20 7165 9644

realisestudio.com
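
Of the three options in the original question, Option 1 (remove the stale copy from the "bad" brick and let self-heal recreate it from the "good" one) can be sketched roughly as below for a single file. This is only a sketch: BRICK and BADFILE are placeholder values, it assumes the .glusterfs/<aa>/<bb>/<gfid> hard-link layout used by 3.3-era bricks, and it should be run as root on the brick server only after confirming which copy really is the good one.

#!/bin/bash
# Rough sketch of "Option 1" for a single file, run on the brick judged "bad".
# BRICK and BADFILE are placeholders; verify the surviving copy is good first.
set -euo pipefail

BRICK="/data/brick0"                    # placeholder: brick root on this server
BADFILE="projects/foo/file.psd"         # placeholder: path relative to the brick

# Read the file's GFID so the matching .glusterfs hard link can be removed too.
HEX=$(getfattr -n trusted.gfid -e hex "${BRICK}/${BADFILE}" | awk -F'0x' '/trusted.gfid/ {print $2}')
[ ${#HEX} -eq 32 ] || { echo "could not read trusted.gfid" >&2; exit 1; }
GFID="${HEX:0:8}-${HEX:8:4}-${HEX:12:4}-${HEX:16:4}-${HEX:20:12}"

rm -v "${BRICK}/${BADFILE}"
rm -v "${BRICK}/.glusterfs/${GFID:0:2}/${GFID:2:2}/${GFID}"

# Stat'ing the file through a client mount (or running "gluster volume heal
# <volname>") should then let glustershd rebuild it from the good brick.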