Hopefully, this will be enough of a hint for the developers to be able to
tell what's going on with the mtime self-heal issue. I'm guessing that
Gluster writes via NFS are marked as requiring self-healing, even if
they're fine. Self-heal gets triggered and messes up the mtime in
the process. So, that may be two different bugs...
Before self-heal, a file has attributes as follows (on both halves of the
mirror):
# file: disk1/glusterfs/software2/xwin32.bat
trusted.afr.brick4-1a=0sAAAAAQAAAAAAAAAA
trusted.afr.brick4-1b=0sAAAAAQAAAAAAAAAA
The directory has:
trusted.afr.brick4-1a=0sAAAAAAAAAAAAAAAA
trusted.afr.brick4-1b=0sAAAAAAAAAAAAAAAA
trusted.glusterfs.dht=0sAAAAAQAAAAB//////////w==
After the self-heal, there's no change in directory attributes, but the
file attributes change on both nodes to:
# file: disk1/glusterfs/software2/xwin32.bat
trusted.afr.brick4-1a=0sAAAAAAAAAAAAAAAA
trusted.afr.brick4-1b=0sAAAAAAAAAAAAAAAA
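
For anyone reading along, the values above are base64 (the "0s" prefix is how getfattr marks base64 output). Here is a minimal sketch for decoding them. It assumes each trusted.afr.* value holds three big-endian 32-bit pending counters in the order data, metadata, entry; that layout is my assumption about the replicate translator, not something confirmed in this thread. The helper name decode_afr_changelog is made up for illustration.

```python
import base64
import struct


def decode_afr_changelog(value):
    """Decode a getfattr-style base64 value into three pending counters.

    ASSUMPTION: the value is 12 bytes, read as three big-endian unsigned
    32-bit integers in the order (data, metadata, entry).
    """
    # getfattr marks base64-encoded values with a leading "0s".
    if value.startswith("0s"):
        value = value[2:]
    raw = base64.b64decode(value)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # File attributes before and after self-heal, copied from above.
    print("before:", decode_afr_changelog("0sAAAAAQAAAAAAAAAA"))
    print("after: ", decode_afr_changelog("0sAAAAAAAAAAAAAAAA"))
```

If that layout assumption holds, the "before" value is a data counter of 1 on both halves of the mirror, and the "after" value is all zeros, which would fit the guess that the file was marked as needing a heal even though its contents were fine.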
Thanks,
Brent
On Wed, 13 May 2009, Brent A Nelson wrote:
Further insight: the initial mtimes are correct (and probably are for rsync,
as well); the mtimes are getting modified by self-heal. I'm not sure why
self-heal did anything at all, as the mirrors should have been in sync, but
it triggers on ls and quickly changes the mtimes on the second half of the
mirror.
Thanks,
Brent
On Wed, 13 May 2009, Brent A Nelson wrote:
I take that back; it doesn't seem to help, at least for the initial rsync.
Shouldn't it, though? If a file on one half of the mirror has a different mtime
than the file on the other half, shouldn't self-heal fix it?
The early indication was that it did seem to help in the case that I
manually removed a file from the second half of the mirror. With the
metadata-change-log option on, the file self-healed with the correct mtime;
without the metadata-change-log option, it did not. I only tried it once,
though, so it might have just been coincidence.
I also saw the mtime issue with "cp -a". It appeared to occur in one brief
burst, and this burst spanned multiple mirrors (a few files created at
4:46pm, all in the same specific directory, for two different mirrors and
different server nodes, have bad mtimes on the second half of each mirror).
This was far less common than with rsync, however.
Thanks,
Brent
On Wed, 13 May 2009, Brent A Nelson wrote:
Early indications are that setting "option metadata-change-log on" for
cluster/replicate is a likely workaround for the mtime issue. I'll start
over, and see if the issue is truly gone with this option in place...
It might be worth considering defaulting this option to "on".
Thanks,
Brent
On Wed, 13 May 2009, Brent A Nelson wrote:
On Wed, 13 May 2009, Brent A Nelson wrote:
With regard to the incorrect modification time appearing on some files,
I note the following:
ls -l on Node 1 in a mirror:
-r--r--r-- 1 root root 40280 2008-03-17 12:03 /disk1/glusterfs/tftpboot/hardy64/pool/main/t/tasksel/tasksel-data_2.70ubuntu4_all.deb
Node 2 in the same mirror:
-r--r--r-- 1 root root 40280 2009-05-12 19:40 /disk1/glusterfs/tftpboot/hardy64/pool/main/t/tasksel/tasksel-data_2.70ubuntu4_all.deb
It appears that one half of the mirror set the modification time correctly,
but the other half did not.
Just a bit of additional info. It appears that the first half of the
mirror has correct mtimes. The second half of the mirror has wrong
mtimes on all files, but directory mtimes are fine. When you view the
mtimes through the GlusterFS mount, sometimes you get the mtime from one
node, sometimes the other, hence the seeming randomness.
Also, I see that zero-length files have correct mtimes on both halves of
a mirror.
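
In case it helps anyone reproduce this, here is a rough sketch of how the two halves of a mirror could be compared. It assumes both backend directories are reachable from a single host (for example, with the second node's export mounted temporarily); the function name and the command-line paths are placeholders, not anything GlusterFS provides.

```python
import os
import stat
import sys


def find_mtime_mismatches(brick_a, brick_b):
    """Return (relative path, mtime on A, mtime on B, size) for each regular
    file present under both directories whose whole-second mtimes differ."""
    mismatches = []
    for dirpath, _dirnames, filenames in os.walk(brick_a):
        for name in filenames:
            path_a = os.path.join(dirpath, name)
            rel = os.path.relpath(path_a, brick_a)
            path_b = os.path.join(brick_b, rel)
            try:
                st_a = os.lstat(path_a)
                st_b = os.lstat(path_b)
            except FileNotFoundError:
                # Missing on one side: a different problem, skipped here.
                continue
            if not stat.S_ISREG(st_a.st_mode):
                continue
            mtime_a, mtime_b = int(st_a.st_mtime), int(st_b.st_mtime)
            if mtime_a != mtime_b:
                mismatches.append((rel, mtime_a, mtime_b, st_a.st_size))
    return sorted(mismatches)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_mtimes.py /path/to/half1 /path/to/half2
    for rel, mtime_a, mtime_b, size in find_mtime_mismatches(*sys.argv[1:3]):
        print("%s\t%d\t%d\t%d bytes" % (rel, mtime_a, mtime_b, size))
```

The size is included in the output so that the zero-length-file observation can be checked at a glance: if it holds, no zero-byte files should show up in the list.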
Thanks,
Brent