Martin Fick wrote:
--- Martin Fick <mogulguy@xxxxxxxxx> wrote:
Original creation process and versioning:
/
v1
/dir1/
v2 v1
/dir1/dir2/
v2 v2 v1
/dir1/dir2/file
v2 v2 v2 v1
Mirror goes off-line with version #s of dir2 and
file as: v2/v1.
-> file deleted
/dir1/dir2/
v2 v2 v3
-> dir2 deleted
/dir1/
v2 v3
-> dir2 recreated
/dir1/dir2/
v2 v4 v1
-> file recreated
/dir1/dir2/file
v2 v4 v2 v1
...
However, if we were looking at the versions all the
way to the root, when the mirror went off-line we
would have had: /v2/v2/v2/v1 and now we have:
/v2/v4/v2/v1. There is a chance that we are
talking about different files now. Of course, the
problem I see now is that the files could in fact
have been the same even though the version number is
different with this scheme! Since the only version
# that is different is that of dir1 (v4), this could
have been caused by simply adding two new files to
that directory!
Hmm, I think that my logic may have been flawed here
and that the scheme would actually work (as long as
you go to the root). The mismatch above would only
exist if in fact the file had been recreated! If the
file had not been recreated, its version # would still
be /v2/v2/v2/v1, even though recalculating it now
would yield /v2/v4/v2/v1. But we are not
recalculating it; we are trying to see if the files
on two subnodes were created at the same time, in
which case the version history should be the same,
right?
This assumption only holds if the parent directories
all the way to root are healed before a file is
created/modified, though. I am not sure that this
currently happens with AFR. Does it?
If the parent directories (all the way up) are not
healed, then a version mismatch could be created
when a file is modified and its version is updated.
In this case, despite the version mismatches, the
files are in fact the same. It does not seem like
it would be too difficult to force the parent
directories to heal before writing to the file.
However, if a directory heal causes all changed file
data (or just new files+data?) in those directories
to heal, that could be a long delay. Thoughts? I
must admit, I am having a hard time following all
these constraints. :) ... If this works, there would
be no useless resyncing caused by wrongly concluding
that files have changed, as I previously surmised.
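To make the root-to-file comparison above concrete, here is a minimal Python sketch. The function name and the use of integer tuples for version chains are assumptions made for illustration; nothing here is taken from AFR itself.

```python
from typing import Optional, Sequence


def first_divergence(a: Sequence[int], b: Sequence[int]) -> Optional[int]:
    """Return the depth (0 = root) at which two root-to-file version
    chains first differ, or None if the chains are identical."""
    for depth, (va, vb) in enumerate(zip(a, b)):
        if va != vb:
            return depth
    if len(a) != len(b):
        # One chain is a strict prefix of the other.
        return min(len(a), len(b))
    return None


# Chains from the example: /v2/v2/v2/v1 when the mirror went off-line,
# /v2/v4/v2/v1 after dir2 and file were recreated.
offline_chain = (2, 2, 2, 1)
current_chain = (2, 4, 2, 1)
divergence = first_divergence(offline_chain, current_chain)
```

In this sketch the two chains diverge at depth 1, which is dir1, the only component whose version differs in the example.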
If you increment directory version numbers on all directory listing
changes, I still see a major problem:
1. Adding, renaming, or removing a file or directory in ANY directory
now cascades the version number change up to the root directory,
effectively incrementing the version number of ALL files and marking
them as dirty/needing update to all other servers. I hope you agree
this is Very Bad (tm). You could solve it with checksums, but as
someone pointed out, that could get expensive, even with a checksum
cache, when the entire tree needs to be checked every time.
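As a rough illustration of that cascade, the following Python sketch models per-directory counters that are incremented on every listing change, with the change propagated to every ancestor. The helper names and the dictionary layout are hypothetical and are not part of any real implementation.

```python
def ancestors(path):
    """Yield "/", "/a", "/a/b", ... for every directory above path."""
    parts = path.strip("/").split("/")
    for depth in range(len(parts)):
        yield "/" + "/".join(parts[:depth])


def add_entry(versions, path):
    """Create path at version 1 and bump the counter of every ancestor."""
    versions[path] = 1
    for directory in ancestors(path):
        versions[directory] += 1


def version_chain(versions, path):
    """Versions from the root down to path, as compared in the quoted scheme."""
    return tuple(versions[d] for d in ancestors(path)) + (versions[path],)


versions = {"/": 1, "/a": 1, "/a/file": 1, "/b": 1}
before = version_chain(versions, "/a/file")
add_entry(versions, "/b/new")            # unrelated change in another directory
after = version_chain(versions, "/a/file")
```

Here /a/file was never touched, yet its root-to-file chain differs before and after, because the root's counter moved.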
I believe that the need for this cascade and healing is illustrated by
the following example: given a synchronized /a/b/c/file, run against server 1:
$ cd /
$ mv a z
$ mkdir -p a/b/c
$ echo whatever >a/b/c/file
Then, against server 2:
$ cat /a/b/c/file
Server 2 would have to know to heal directory listings all the way up to
its root directory listing to give the correct answer here.
I think the single, global version number I mentioned in the "Client
side AFR race conditions" thread provides an interesting solution here.
Consider the following commands and their corresponding file system
states starting with an empty root. In this model, changing the
content/version number of any child element is considered to change the
directory listing of the parent, and renames update the version number
of all children of the renamed element:
/ v1
$ mkdir /a
/ v2
/a v2
$ mkdir /b
/ v3
/a v2
/b v3
$ echo whatever > /a/1
/ v4
/a v4
/a/1 v4
/b v3
$ echo whatever > /a/2
/ v5
/a v5
/a/1 v4
/a/2 v5
/b v3
$ mv /a /z
/ v6
/b v3
/z v6
/z/1 v6
/z/2 v6
$ rm /z/2
/ v7
/b v3
/z v7
/z/1 v6
This glosses over the locking issues we were discussing in the other
thread, but in this model, a client can quickly determine whether its
copy of any directory listing or file is up to date based solely on that
file or directory's own version number (locally and on the server), and
giving a parent directory a new version number does not invalidate the
data of all its children.
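
A minimal Python sketch of this model follows, assuming a single in-memory counter and a path-to-version dictionary. The class and function names are made up for illustration, and locking is ignored entirely, as above.

```python
class VersionedTree:
    """Toy model of a tree stamped by one global version counter."""

    def __init__(self):
        self.clock = 1                 # the single, global version number
        self.versions = {"/": 1}       # path -> version of that entry

    @staticmethod
    def _parent(path):
        return path.rsplit("/", 1)[0] or "/"

    def _stamp_up(self, path):
        # Stamp the entry and each ancestor, since a changed child counts
        # as a change to its parent's directory listing.
        while True:
            self.versions[path] = self.clock
            if path == "/":
                return
            path = self._parent(path)

    def _subtree(self, path):
        prefix = path + "/"
        return [p for p in self.versions if p == path or p.startswith(prefix)]

    def create(self, path):
        """mkdir, or writing a new file."""
        self.clock += 1
        self._stamp_up(path)

    def remove(self, path):
        self.clock += 1
        for p in self._subtree(path):
            del self.versions[p]
        self._stamp_up(self._parent(path))

    def rename(self, src, dst):
        """Renames restamp the renamed element and all of its children."""
        self.clock += 1
        for p in self._subtree(src):
            del self.versions[p]
            self.versions[dst + p[len(src):]] = self.clock
        self._stamp_up(self._parent(src))
        self._stamp_up(self._parent(dst))


def is_up_to_date(local, server, path):
    """Compare only the entry's own version number, locally and on the server."""
    return path in local and local[path] == server.get(path)


tree = VersionedTree()
tree.create("/a")
tree.create("/b")
tree.create("/a/1")
tree.create("/a/2")
snapshot = dict(tree.versions)   # a client's cached view at this point
tree.rename("/a", "/z")
tree.remove("/z/2")
```

With the cached snapshot taken before the rename, is_up_to_date(snapshot, tree.versions, "/b") holds, while the same check on "/" fails, so the client refetches the root listing without treating /b as dirty.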
Regards,
Derek
--
Derek R. Price
Solutions Architect
Ximbiot, LLC <http://ximbiot.com>
Get CVS and Subversion Support from Ximbiot!
v: +1 248.835.1260
f: +1 248.246.1176