It would be good to have these monitoring capabilities rolled into Gluster so that one can identify issues such as GFID and xattr mismatches proactively. Currently, there is no such feature; you find out the hard way, when clients are impacted.

On Thu, Jul 14, 2011 at 1:35 PM, John Mark Walker <jwalker at gluster.com> wrote:
> Joe - thanks for taking the time to write this up. It sounds like the issue this is designed to fix is related to the GFID mismatch issue that we released a preventive fix for today.
>
> The sanity checks could be useful, though. Does today's release change anything with respect to your tools?
>
> -John Mark
> Gluster Community Guy
>
> ________________________________________
> From: gluster-users-bounces at gluster.org [gluster-users-bounces at gluster.org] on behalf of Joe Landman [landman at scalableinformatics.com]
> Sent: Wednesday, July 13, 2011 9:15 PM
> To: gluster-users
> Subject: Tools for the admin
>
> Hi folks
>
> We have run into a number of problems with missing files (among other
> things). So I went hunting for the files. Along the way, I came up
> with some very simple sanity checks and tools for helping to correct
> such situations. They will not work on striped data ... sorry.
>
> Sanity check #1: conservation of number of files
>
> The sum of the number of files on your backing stores (excluding links
> and directories) should equal (with possible minor variance due to
> gluster internals) the sum of the number of files (excluding links and
> directories) in your gluster volumes.
>
> If you have, say, 6 bricks, each with nearly 1M files, and a dht volume
> built from those bricks, you really ... REALLY ... shouldn't have only
> 1.8M files in your volume. If you do, then some files are missing from
> the volume (really). You can tell which files these are, as they have no
> xattrs. Yeah. Really.
>
> How can you enumerate what you have?
>
> Simple. Meet file_accounting.pl (available at
> http://download.scalableinformatics.com/gluster/utils/).
>
> This handy utility will tell you important things about your file system.
>
> [root at jr4-1 temp]# /data/tiburon/install/scan/file_accounting.pl --bspath=/data/brick-sdc2/dht/
> Number of entries: 944604
> Number of links  : 6711
> Number of dir    : 102825
> Number of files  : 834794
>
> --bspath is the "backing store path", where the files reside. It works
> just as well on your gluster volume, which allows you to inspect your
> sanity with appropriate sums.
>
> So you need to copy these missing files into the volume, and move them
> out of the way first before copying them in.
>
> Which leads to tools #2 and #3. First, you need to scan your backing
> store file system for the files.
>
> Tool #2: scan_gluster.pl
>
> /data/tiburon/install/scan/scan_gluster.pl --bspath=/data/brick-sdc2/dht > /data/brick-sdc2/temp/sdc2.data
>
> will grab lots of nice info about each file, including the attributes.
> You can now use grep against the sdc2.data file and look only for
> 'attr=,'; those will be things gluster knows about to varying degrees in
> your file system. Some of these, specifically files with this condition,
> yeah, those are missing files. The ones I was trying to find.
>
> If you have a user who notes that files occasionally go missing, yeah,
> this can help you find them if they exist on the backing store. Which
> they probably do.
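
As a rough cross-check of sanity check #1 and the missing-xattr test above, something along these lines with stock tools gets you close (the brick path, scan output file, and /mnt/glustervol mount point are placeholders for your own layout, and the counts will differ slightly because of gluster's own internal/link files):

# count regular files (no symlinks, no directories) on one backing store
find /data/brick-sdc2/dht -type f | wc -l

# count regular files as seen through the mounted gluster volume
find /mnt/glustervol -type f | wc -l

# entries scan_gluster.pl recorded with no xattrs at all, i.e. the likely missing files
grep 'attr=,' /data/brick-sdc2/temp/sdc2.data

# spot-check a suspect file's extended attributes directly on the brick (as root)
getfattr -d -m . -e hex /data/brick-sdc2/dht/path/to/suspect/file

Sum the per-brick counts and compare against the count on the mount; the grep pattern is just the one described for the scan output.
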
>
> The next tool is dangerous. So far all we have done is to scan the
> backing store. Now we are going to make changes. No, don't worry, it's
> actually ... almost ... safe. We do a file move to another location
> (preferably on the same device/mount point in the backing store), then a
> copy into the gluster volume (yes, you need to mount it on your brick
> nodes). The danger is in modifying a gluster file system backend. Don't
> do this. Ever. Unless 3/4 of your files go missing.
>
> And, by the way, we have a handy dandy --md5 switch on there, if you
> want the scan to take forever.
>
> Tool #3: data_mover.pl
>
> This will do the dirty work. It parses the output of scan_gluster, and
> makes changes. There is a --dryrun option for those who want to try it,
> and a -T <number> option to specify the number of changes to make,
> allowing you to try it (hence the T ... for TRY) on some number of
> files. It will preserve ownership and permission mask (ohhh ahhh ...
> shiny!). The --tmp option happily sets your temporary directory.
> The --verbose and --debug options should be obvious.
>
> nohup ./data_mover.pl --data sdd2-nomd5.data --debug --verbose --tmp `pwd`/tmp -T 2000000 >> out 2>&1 &
>
> Note: all of these tools currently use /opt/scalable/bin/perl as the
> interpreter. This is because our Perl build (5.12.3) includes all the
> bits we need to make this work. If you want to use them, you are
> welcome to change /opt/scalable/bin/perl to /usr/bin/perl, and then you
> will have to install a few modules:
>
>        cpan Getopt::Lucid File::ExtAttr
>
> If you have an issue with either, please let me know.
>
> We can turn these into binaries if someone needs them. Source is at
>
> http://download.scalableinformatics.com/gluster/utils/
>
> Let me know (offline) if you run into problems if you decide to give
> them a try. Note, they are GPL2 (no license tag on them), no warranty,
> and data_mover.pl will MOST DEFINITELY DESTROY DATA. We aren't liable
> for any damages if you use it. Caveat emptor. Let the admin beware.
> Did I mention that data_mover.pl WILL DESTROY YOUR DATA? I am not sure
> if I did. So here it is again. data_mover.pl WILL DESTROY YOUR DATA.
>
> Don't use these unless you have a backup. Especially data_mover.pl.
> Because IT WILL MOST DEFINITELY DESTROY YOUR DATA. Might even bite your
> dog, egg your house, and do all sorts of other nastiness. It will
> increase entropy in the universe.
>
> But if you are staring at the rear end of 3.8M missing files, wondering
> WTF, mebbe ... that data lossage thing doesn't sound so bad. Especially
> if you can reverse it.
>
> So feel free to look them over. I plan to hone and refine them over
> time. Add documentation even. If they prove useful enough for people to
> use, please let me know. And they are GPL v2.
>
> The site is http://download.scalableinformatics.com/gluster/utils/ .
>
> --
> Joseph Landman, Ph.D
> Founder and CEO
> Scalable Informatics, Inc.
> email: landman at scalableinformatics.com
> web  : http://scalableinformatics.com
>        http://scalableinformatics.com/sicluster
> phone: +1 734 786 8423 x121
> fax  : +1 866 888 3112
> cell : +1 734 612 4615
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
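
For anyone who wants to see the shape of the recovery step before pointing data_mover.pl at real data, this is roughly what the move-then-copy described above looks like done by hand for a single orphaned file. It is only a sketch of the approach, not what the script actually runs; the brick path, temp directory, file name, and /mnt/glustervol mount point are all placeholders, and the same warnings apply, so test on throwaway files first.

# stash the orphan outside the export directory, staying on the same
# filesystem so the move is a cheap rename
mkdir -p /data/brick-sdc2/tmp
mv /data/brick-sdc2/dht/users/bob/results.dat /data/brick-sdc2/tmp/results.dat

# copy it back in through the mounted gluster volume so gluster itself
# creates the entry (and its xattrs); -p preserves ownership and mode
cp -p /data/brick-sdc2/tmp/results.dat /mnt/glustervol/users/bob/results.dat

# only after verifying the copy would you remove the stashed original

In practice you would presumably start the real tool with --dryrun and a small -T value, as described above, before doing anything at scale.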