Hello Todd and Gluster-users,

The same thing happened to one of my volumes the last time I tried a rebalance...migrate-data operation. I reported it to the list here:

http://gluster.org/pipermail/gluster-users/2012-January/009343.html

Fortunately it happened to a volume I was using mainly for backups, so I decided to start again from scratch rather than try to clean up the volume. I would really like to have a working migrate-data feature because my volumes have all been expanded many times without migrate-data being performed. I am worried that it might never be possible to do it successfully now that most of the files are on the wrong bricks.

I came across "multiple subvolumes" errors on another occasion when migrate-data had not been performed, and that time only a handful of files were affected so I was able to clean up the errors manually. One version of each duplicated file was zero bytes, so it was easy to decide which were the correct versions.

I have no idea what caused the zero byte versions to be created, but I thought it might have been the legacy of GFID related bugs in earlier versions of GlusterFS. There were several occasions when I had problems running fix-layout after expanding a volume, and I thought this might have messed up the extended attributes enough to end up with files of the same name on different bricks. I did also wonder if the zero byte duplicates might have been created because glusterd crashed or stopped responding, but I couldn't find anything in the logs to support this theory.

-Dan.

On 02/26/2012 07:00 PM, gluster-users-request at gluster.org wrote:
> Date: Sun, 26 Feb 2012 11:17:53 -0500 (EST)
> From: Todd Pfaff <pfaff at rhpcs.mcmaster.ca>
> Subject: cleaning up duplicate files
> To: gluster-users at gluster.org
> Message-ID:
>   <alpine.LMD.2.00.1202261043320.29413 at rhpcserv.rhpcs.mcmaster.ca>
> Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII
>
> I'm using gluster 3.2.5. I have a situation where I've somehow gotten
> multiple copies of some files on back-end bricks that are members of the
> same distribute volume set. Accessing these files from the front-end
> volume results in an Input/Output error. I don't know how I got into
> this situation and I don't really care about that at the moment. I'd
> just like to fix the problem now without having to go to the extreme
> of removing everything from the bricks.
>
> I'd do the fixing manually if it were a small number of files but there
> are thousands.
>
> Is there any gluster operation that can automatically fix such cases?
>
> Alternatively, short of removing everything from back-end bricks and
> starting from a clean slate, has anyone written code to find and fix such
> duplicate files?
>
> Fortunately these files are backups so if I do have to remove them
> completely the primary copy still exists elsewhere.
>
> Regards,
> Todd
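
For anyone wanting to script the "find" half of Todd's question, here is a minimal sketch in Python. It is not a GlusterFS tool and has not been run against a real volume. It assumes the bricks of one distribute volume are all readable as local directories from wherever the script runs, and it leans on the observation above that one copy of each duplicate was zero bytes. The function names (find_duplicates, classify, report) and the brick paths are hypothetical. It only reports; it never deletes anything.

```python
import os
from collections import defaultdict


def find_duplicates(brick_roots):
    """Return {relative_path: [(brick_root, size), ...]} for every regular
    file that exists under the same relative path on more than one brick."""
    copies = defaultdict(list)
    for root in brick_roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(full) or not os.path.isfile(full):
                    continue
                rel = os.path.relpath(full, root)
                copies[rel].append((root, os.path.getsize(full)))
    return {rel: found for rel, found in copies.items() if len(found) > 1}


def classify(found):
    """Split the copies of one file into (non_empty, zero_byte) lists."""
    non_empty = [copy for copy in found if copy[1] > 0]
    zero_byte = [copy for copy in found if copy[1] == 0]
    return non_empty, zero_byte


def report(brick_roots):
    """Build one human-readable line per duplicated path."""
    lines = []
    for rel, found in sorted(find_duplicates(brick_roots).items()):
        non_empty, zero_byte = classify(found)
        # Heuristic (an assumption): exactly one non-empty copy plus one or
        # more empty copies suggests the empty ones are the stale ones.
        if len(non_empty) == 1 and zero_byte:
            verdict = "likely stale zero-byte copy"
        else:
            verdict = "needs manual review"
        detail = ", ".join(f"{root} ({size} bytes)" for root, size in found)
        lines.append(f"{rel}: {verdict}: {detail}")
    return lines


if __name__ == "__main__":
    import sys
    # Usage (paths are examples only):
    #   python find_dups.py /export/brick1 /export/brick2
    for line in report(sys.argv[1:]):
        print(line)
```

Files flagged "needs manual review" have either several non-empty copies or only empty ones, so size alone cannot pick a winner. Note also that on a distribute volume some zero-byte entries may be DHT link files rather than damage, which is one more reason the sketch stops at reporting and leaves any removal to a human.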