I have another clue to report:

So I have my export directory as: /export
Mounted as: /scratch

If I do "ls -lR /scratch", it's supposed to synchronize all files and metadata, right? Well, it doesn't seem to be doing that. I have approximately 100 files in one problematic folder, but only 50 of them show up to ls - that is, until I list a missing file specifically. The missing files also don't show up in the export directory until they are ls'd by name in /scratch:

ls /scratch/file*       # results in files 1-49 being listed
ls /export/file*        # same result as above
ls /export/file50.dat   # no such file or directory
ls /scratch/file50.dat  # lists the file as if nothing was ever wrong
ls /export/file50.dat   # shows up now, after the specific ls call in /scratch
ls /scratch/file*       # results in files 1-50 being listed now (magic?)
ls /export/file*        # also results in files 1-50 being listed now

I'm considering doing a:

for n in `seq 51 100` ; do ls /scratch/file$n.dat ; done

just to recover the files. However, I'm delaying that so I can keep some of them in the problematic state, should someone give me additional debugging steps here.

Don't get me wrong- I appreciate any help I can get with a free product like this. But I'm actually surprised that a report like this seems to be hitting a dead end on this list in terms of responses. Isn't this alarming behavior? Somehow the filesystem got into a state where files were still recorded, but weren't represented until specifically listed. That should tell us something, but I'm no expert here.
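In the meantime, here's how I can enumerate exactly which entries are stuck in this state without healing them: diff the backend listing against what readdir on the mount returns. A minimal sketch, assuming it runs on a node that exports a brick locally, and that a plain ls -1 only issues readdir (none of the by-name lookups that resurrected file50.dat):

# Names present in the backend export but missing from the mount's readdir.
ls -1 /export | sort > /tmp/export.list
ls -1 /scratch | sort > /tmp/scratch.list
comm -23 /tmp/export.list /tmp/scratch.list   # backend-only = invisible files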
thx-

    Jeremy

Jeremy Enos wrote:
> Can anyone tell me if there's hope of recovering data here? Steps to
> take? Anything? Is something wrong with my configuration? (raid1 over
> raid0) If I don't have a clue what went wrong or why, or how to
> recover, then even formatting and starting fresh doesn't lend much hope
> for future reliability.
> thx-
>
>     Jeremy
>
> Jeremy Enos wrote:
>> plain text send...
>>
>> Jeremy Enos wrote:
>>> What kind of tweaking and tampering was necessary to recover the lost
>>> data?
>>>
>>>     Jeremy
>>>
>>> My configuration:
>>> Oh yes- of course- don't know why I left this out. Version and
>>> config files follow.
>>>
>>> [jenos at ac glusterfs]$ rpm -qa |grep gluster
>>> glusterfs-common-2.0.7-1.fc10.x86_64
>>> glusterfs-client-2.0.7-1.fc10.x86_64
>>>
>>> [jenos at ac glusterfs]$ cat glusterfs.vol
>>> #-----------IB remotes------------------
>>> volume remote1
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac11
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote2
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac12
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote3
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac13
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote4
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac14
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote5
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac15
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote6
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac16
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote7
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac17
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote8
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac18
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote9
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac19
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> volume remote10
>>>   type protocol/client
>>>   option transport-type ib-verbs/client
>>>   option remote-host ac20
>>>   option remote-subvolume ibstripe
>>> end-volume
>>>
>>> #----------Stripe and Replicate------------------
>>>
>>> volume stripe1
>>>   type cluster/stripe
>>>   option block-size 1MB
>>>   subvolumes remote1 remote2 remote3 remote4 remote5
>>> end-volume
>>>
>>> volume stripe2
>>>   type cluster/stripe
>>>   option block-size 1MB
>>>   subvolumes remote6 remote7 remote8 remote9 remote10
>>> end-volume
>>>
>>> volume replicate
>>>   type cluster/replicate
>>>   option metadata-self-heal on
>>>   subvolumes stripe1 stripe2
>>> end-volume
>>>
>>> #------------Performance Options-------------------
>>>
>>> volume readahead
>>>   type performance/read-ahead
>>>   option page-count 4            # 2 is default option
>>>   option force-atime-update off  # default is off
>>>   subvolumes replicate
>>> end-volume
>>>
>>> volume writebehind
>>>   type performance/write-behind
>>>   option cache-size 1MB
>>>   subvolumes readahead
>>> end-volume
>>>
>>> volume cache
>>>   type performance/io-cache
>>>   option cache-size 1GB
>>>   subvolumes writebehind
>>> end-volume
>>>
>>> [jenos at ac glusterfs]$ cat glusterfsd.vol
>>> volume posix
>>>   type storage/posix
>>>   option directory /export
>>> end-volume
>>>
>>> volume locks
>>>   type features/locks
>>>   subvolumes posix
>>> end-volume
>>>
>>> volume ibstripe
>>>   type performance/io-threads
>>>   option thread-count 4
>>>   subvolumes locks
>>> end-volume
>>>
>>> volume server-ib
>>>   type protocol/server
>>>   option transport-type ib-verbs/server
>>>   option auth.addr.ibstripe.allow *
>>>   subvolumes ibstripe
>>> end-volume
>>>
>>> volume server-tcp
>>>   type protocol/server
>>>   option transport-type tcp/server
>>>   option auth.addr.ibstripe.allow *
>>>   subvolumes ibstripe
>>> end-volume
>>>
>>> [jenos at ac glusterfs]$
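(An aside on the client config above: if a log of the actual calls would help anyone debug this, I believe a trace translator can be stacked on top of the client volfile so that every fop, readdir included, gets written to the client log. A sketch only, assuming the debug/trace translator shipped with 2.0.7 behaves this way:

volume trace
  type debug/trace
  subvolumes cache
end-volume

Appended at the end of glusterfs.vol it should become the topmost volume; if the client doesn't pick it up automatically, it can presumably be selected with --volume-name trace.)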
>>>
>>> Krzysztof Strasburger wrote:
>>>> On Wed, Nov 04, 2009 at 01:31:30AM -0600, Jeremy Enos wrote:
>>>>> Hi-
>>>>> I've got a problem where certain batches of files written out to
>>>>> gluster have disappeared. Also, newly created files sometimes
>>>>> don't show up to ls unless they are explicitly specified to ls and
>>>>> other tools.
>>>>>
>>>>> In my export folder, everything appears fine.
>>>>> I have found that when I touch the missing file in gluster, it
>>>>> comes back and shows a file size, but appears empty. I've tried
>>>>> unmounting, restarting all glusterfsds, and remounting, and it
>>>>> stayed the same. Also, this problem did not show up immediately
>>>>> after setting up the filesystem, at least during basic tests.
>>>>> Any ideas?
>>>>>
>>>> What is your configuration? I experienced similar problems with unify
>>>> after a disk crash. The namespace (replicated) was not rebuilt
>>>> correctly after replacing the failing unit, and I had to add some
>>>> files manually (OK, using a script, but an intervention was needed).
>>>> No data loss, only a bit of tweaking and tampering ;).
>>>> Krzysztof
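For reference, the kind of script Krzysztof describes is roughly what my seq loop above generalizes to: walk the backend export and force a by-name lookup through the mount for every entry. A sketch, assuming the backend and mount paths mirror each other and that a stat by name restores an entry the same way the explicit ls restored file50.dat:

#!/bin/bash
# Force a lookup on every backend path via the glusterfs mount,
# and report anything that still fails to resolve afterwards.
cd /export || exit 1
find . -mindepth 1 | while read -r p; do
    stat "/scratch/${p#./}" > /dev/null 2>&1 || echo "still missing: $p"
done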