Liam Slusser wrote:
Even after manually fixing (adding or removing) the extended attributes, I
was never able to get Gluster to see the missing files. So I ended up
writing a quick program that searched the raw brick filesystems,
checked whether each file existed in the Gluster cluster, and tagged the
file if it didn't. Once that job was done I shut down Gluster, moved all
the missing files off the raw bricks into temp storage, restarted
Gluster, and copied all the files back into each directory. That fixed
the missing file problems.
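The original program was not posted, so the following is only a minimal sketch of the brick-versus-mount comparison described above. The function name find_missing and both path arguments are hypothetical, and it assumes the volume is mounted locally and that "existed in the Gluster cluster" can be approximated by checking the mount's directory listing.

```python
import os


def find_missing(brick_root, mount_root):
    """Yield paths (relative to brick_root) of files that are present on
    the raw brick but absent from the directory listing of the mount.

    Both arguments are assumptions for illustration: brick_root is the
    brick's backing directory, mount_root is a client mount of the volume.
    """
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        rel_dir = os.path.relpath(dirpath, brick_root)
        mount_dir = os.path.normpath(os.path.join(mount_root, rel_dir))
        try:
            # Compare against the listing rather than stat-ing each name,
            # so the check reflects what 'ls' on the mount would show.
            visible = set(os.listdir(mount_dir))
        except OSError:
            # The directory itself is not visible through the mount.
            visible = set()
        for name in filenames:
            if name not in visible:
                yield os.path.normpath(os.path.join(rel_dir, name))
```

Calling list(find_missing(brick, mount)) with the brick directory and the mount point gives the candidates to "tag"; what to do with them afterwards (move aside, copy back) is left out on purpose.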
I'd still like to find out why Gluster would ignore certain files without
the correct attributes. Even removing all the file attributes wouldn't
fix the problem. I also tried manually copying a file into a brick, which
it still wouldn't find. It would be nice to be able to manually copy
files into a brick, then set an extended attribute flag which would
cause Gluster to see the new file(s) and copy them to all bricks after an
ls -alR was done. Or even better, just do it automatically when new
files without attributes are found in a brick.
It sounds like you are experiencing this known yet dangerous bug:
http://gluster.org/docs/index.php/Understanding_AFR_Translator#Known_Issues
Quote:
Self-heal of a file that does not exist on the first subvolume:
If a file does not exist on the first subvolume but exists on some other
subvolume, it will not show up in the output of 'ls'. This is because
the replicate translator fetches the directory listing only from the
first subvolume. Thus, the file that does not exist on the first
subvolume is never seen and never healed. However, if you know the name
of the file and do a 'stat' on the file or try to access it in any other
way, the file will be properly healed and created on the first subvolume.
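If the quoted behaviour holds, one possible workaround is to take the file names from a brick that does have them and stat each one by name through the mount. This is a sketch only: stat_by_name and its arguments are hypothetical, it assumes the volume is mounted locally, and whether the stat actually triggers a heal depends on the replicate translator behaving as the documentation says.

```python
import os


def stat_by_name(brick_root, mount_root):
    """Stat, through the mount, every file found on the given brick.

    Returns two lists of relative paths: those whose stat succeeded and
    those whose stat raised an error. The names come from walking the
    brick directly, so files hidden from 'ls' on the mount are included.
    """
    ok, failed = [], []
    for dirpath, _dirnames, filenames in os.walk(brick_root):
        rel_dir = os.path.relpath(dirpath, brick_root)
        for name in filenames:
            rel = os.path.normpath(os.path.join(rel_dir, name))
            try:
                # Access by explicit name, which the quoted text says is
                # what causes the file to be created on the first subvolume.
                os.stat(os.path.join(mount_root, rel))
                ok.append(rel)
            except OSError:
                failed.append(rel)
    return ok, failed
```

The failed list is worth inspecting by hand, since those are files that could not be reached by name even though they exist on the brick.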
So, either the directory listing should be fetched from the
read-subvolume, or better, fetched from all nodes (but that gets slow).
At least if it were fetched from the read-subvolume, you could run a cron
job on each server that runs ls -laR, which would force the files into
sync (since each server probably has itself as the read-subvolume, the
missing files would be found). But that's not how it seems to work at the
moment.
Gordan