On Wed, Jul 25, 2012 at 11:20 PM, Fred van Zwieten <fvzwieten@xxxxxxxxxxxxx> wrote:
"Now I am leaning towards git based versioning. Integrate git into
GlusterFS to track changes on specified events (timer, file-close,
dir-tree-modify..). We may not do this via translator interface, but
through the newly proposed simple event/timer interface."
I am not sure I would like that. Our idea is to make the previous versions (read-only!) available to the end-users through a separate mount-point, taking file permissions into account. I am not sure if that is at all possible when they live inside a git repository.
(disclaimer: I do not know the inner workings of glusterfs nor translators) I would think making it part of (the receiving side of) the geo-replicator translator would be ideal, because it knows what is going on. If a file /a/b/c is updated, its previous version could be stored as /pre/a/b/c.<datetime> or /pre/<datetime>/a/b/c. If the previous versions live on the same file-system you could even play with inodes to keep only the previous versions of changed blocks. This would make it very space efficient (a sort of file based snapshotting).
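The two naming schemes mentioned above can be compared with a small sketch. The function name `versioned_paths`, the `/pre` default and the timestamp format are assumptions made for illustration only; nothing here is part of GlusterFS.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def versioned_paths(path, when, root="/pre"):
    """Return both candidate locations for the previous version of `path`.

    Hypothetical helper: the timestamp format is an arbitrary choice.
    """
    stamp = when.strftime("%Y%m%dT%H%M%SZ")
    rel = PurePosixPath(path).relative_to("/")
    # Scheme 1: /pre/a/b/c.<datetime> keeps all versions next to each other.
    suffix_style = PurePosixPath(root) / rel.parent / f"{rel.name}.{stamp}"
    # Scheme 2: /pre/<datetime>/a/b/c gives one browsable tree per timestamp.
    tree_style = PurePosixPath(root) / stamp / rel
    return str(suffix_style), str(tree_style)


when = datetime(2012, 7, 25, 23, 20, 0, tzinfo=timezone.utc)
print(versioned_paths("/a/b/c", when))
```

The first scheme makes it easy to list every version of one file; the second looks more like a per-timestamp snapshot tree, which may suit a separate read-only mount-point better.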
I do agree that using git makes it more modular and independent of the geo-replicator translator.
I am also curious how you would handle multiple writes to the same file in a short time without ending up with an equal number of previous versions.
Also, I can't find the note you are referring to. Could you please make a feature wiki page using the template?
Fred
We broke GeoReplication into two parts: (1) Marker - change tracking
translator and (2) a simple queue - query changes and invoke rsync with a
specific list of files over ssh. Unlike inotify, the marker framework keeps
track of changes within the filesystem as xtime in extended attributes. You
can ask the filesystem to list all changed files and folders since a
particular point in time. This way, an external service can tolerate a crash,
WAN link failure, etc. Marker allows developers to extend storage
capabilities using a simple application programming model (even scripting
languages are OK).
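A rough sketch of the "list everything changed since time T" query, seen from an external script. This is only an approximation: it assumes the ordinary modification time (`st_mtime`) as a stand-in for the marker's xtime extended attribute, and `changed_since` is a hypothetical name, not a GlusterFS API.

```python
import os


def changed_since(root, since):
    """Yield paths under `root` modified after `since` (seconds since epoch).

    Assumption: st_mtime stands in for the marker's xtime attribute.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            try:
                if os.lstat(full).st_mtime > since:
                    yield full
            except FileNotFoundError:
                # The entry vanished between listing and stat; skip it.
                continue
```

Because the consumer only has to remember the timestamp of its last successful run, it can crash or lose its link and simply ask again from that timestamp later.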
If certain tasks can be achieved outside of a
translator, it is good to do so. Just like kernel mode, translator mode
has some limitations. Translator code has to be efficient, asynchronous
(event driven), latency sensitive and free of memory leaks. If we
extend the marker framework idea into a generic event hook mechanism, we
can develop powerful storage applications outside of the translator
mechanism. Say you register your tool or script for certain events. When
the event occurs, your code gets invoked with the necessary parameters. You
could then operate on the mounted filesystem itself, just as any other
application. For example, you register a git script for invocation on an
event, say "whenever a registered directory tree is modified and more
than 30 mins have elapsed". All this script does is push changes to an
external origin. It is crude and simple, but achieves the goal. Simple
is better. You may also develop anti-virus plugins or silent data
corruption checks using this technique. Users can use a simple git
checkout to flip between views. Because git doesn't scale for large
content, you can limit users to explicitly registering the folders they
want versioned. If you want to create a mountable view of remote content,
you can write a translator to trap chdir or lookup for directories named
after a timestamp and perform a git checkout. If I use git for continuous
automated file system versioning, I will suggest users use the git tool
itself as the UI.
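A minimal sketch of such an event hook, under stated assumptions: the class `TreeModifiedHook` and its methods are hypothetical, and "time elapsed more than 30 mins" is interpreted here as "at least 30 minutes since the previous invocation". Many writes inside one interval collapse into a single callback.

```python
import time


class TreeModifiedHook:
    """Call `callback` with the changed paths, at most once per interval.

    Hypothetical illustration; not an existing GlusterFS interface.
    """

    def __init__(self, callback, min_interval=30 * 60, clock=time.monotonic):
        self.callback = callback
        self.min_interval = min_interval
        self.clock = clock
        self.dirty = set()
        self.last_run = clock()

    def notify(self, path):
        # Repeated writes to the same path are recorded only once.
        self.dirty.add(path)

    def poll(self):
        """Fire the callback if there are changes and the interval has passed."""
        now = self.clock()
        if self.dirty and now - self.last_run >= self.min_interval:
            changed, self.dirty = sorted(self.dirty), set()
            self.last_run = now
            self.callback(changed)
            return True
        return False
```

The callback could be any script, for instance one that commits the listed paths in a git working tree on the mounted filesystem and pushes to an external origin.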
I am just giving you tips and suggestions. Don't limit your ideas in any way.
If I am guessing your idea correctly, it will have a few limitations, but you can live with them.
* Only files are versioned. Directories are not.
* File renames and Directory renames (mv) are not supported.
* Every version is a complete duplicate copy (not as COW or WAFL).
* Changes are tracked at per file level. Changes across a directory
tree are not grouped. I mean cvs style, not like git as a patch set.
It is actually OK to make duplicate copies of changed files. In reality,
for most practical use cases, very few files across the name space get
modified. Most of the files are written once and rarely modified. Files
older than 30 days are hardly accessed. So it is OK to store duplicate
copies of just the changed files. btrfs or device-mapper dedup may
take care of this as well. I won't worry too much about duplicating
data, given its very small proportion.
I didn't quite understand how you can play with inodes to avoid this duplication. Did you mean a btrfs-dedup-like capability?
If you want to avoid these limitations, think about rdiff-backup style
continuous automated backup. Just like georep, you monitor the
filesystem for changes and back up on a continuous basis. It is OK to
give users a tool or API to restore/view older versions. This is much
simpler to implement than a WAFL or COW style storage format and file
level snapshotting.
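To illustrate the idea of keeping the current file plus a small reverse difference for each older version, here is a deliberately simplified sketch. It compares fixed-size blocks at fixed offsets, which is an assumption made for brevity; rdiff-style tools use rolling checksums so that insertions which shift the data are also handled. The function names are hypothetical.

```python
def reverse_delta(old, new, block=4096):
    """Return what is needed to rebuild `old` (bytes) from `new` (bytes).

    Only blocks of `old` that differ from `new` at the same offset are kept.
    """
    changed = {}
    for off in range(0, len(old), block):
        if old[off:off + block] != new[off:off + block]:
            changed[off] = old[off:off + block]
    return len(old), changed


def restore(new, delta, block=4096):
    """Rebuild the previous version from the current one plus its delta."""
    old_len, changed = delta
    # Start from the current content, cut or zero-padded to the old length.
    buf = bytearray(new[:old_len].ljust(old_len, b"\0"))
    for off, data in changed.items():
        buf[off:off + len(data)] = data
    return bytes(buf)
```

For a large file where only a small header region changes in place, the delta holds a few blocks rather than a full copy.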
Anand,
These are all "design" decisions that we do not need and that even make the product less useful in our use-case.
We have a large archive of tiff files. Every tiff file is large (50+ MB). The images themselves do not get modified, but their EXIF metadata does. There are also file renames, and files get re-arranged into different directory structures. For this archive we need a scalable filesystem with georep to a second location _and_ file versioning.
"Because git doesn't scale for large content, you can limit users to explicitly register interested folders for versioning"
Now, it seems to me git does not fit this bill, because it doesn't scale very well.
"* File renames and Directory renames (mv) are not supported"
If you mean building up retention on file renames and moves, I agree for our use-case, but others might need it. Look at backuppc for a cool solution on that.
"* Every version is a complete duplicate copy (not as COW or WAFL)."
The fact that each version is a complete duplicate is not very storage friendly, because in our use-case only the EXIF metadata changes. I seek rdiff-backup-like functionality there.
"It is actually OK to make duplicate copies of changed files. In reality,
for most practical use cases, very few files across the name space gets
modified. Most of the files are written once and rarely modified. Files
older than 30 days are hardly accessed. So it is OK to store duplicate
copies of just the changed files. btrfs or device-mapper dedup may
take care of this as well. I won't worry too much about duplicating
data, given its very small proportion."
I do not agree with you. If you say most of the files are written once and rarely modified, you are narrowing the use-case for glusterfs. You are describing near-WORM. Our use-case is not that. Also, our files do get modified after 30 days. Relying on dedup at the lower fs level is also not good. Suppose you have a 200TB filesystem. It would take post-process dedup a very long time to find the dups. Better to do it inline. Again, look at backuppc for an implementation example.
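A sketch of what "inline" dedup could look like at the moment a version is stored: the content is hashed on the way in and identical content is hard-linked from a pool instead of being copied again. This is loosely modeled on the pooling idea and is not BackupPC's actual implementation; `store_version` and the pool layout are assumptions, and the pool must live on the same filesystem as the destination for hard links to work.

```python
import hashlib
import os
import shutil


def store_version(src, dest, pool):
    """Save `src` as `dest`, sharing storage with identical pooled content."""
    digest = hashlib.sha256()
    with open(src, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    pooled = os.path.join(pool, digest.hexdigest())
    os.makedirs(pool, exist_ok=True)
    if not os.path.exists(pooled):
        # First time this content is seen: keep exactly one real copy.
        shutil.copyfile(src, pooled)
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    # Every stored version is just another name for the pooled data.
    os.link(pooled, dest)
    return pooled
```

Whole-file pooling like this absorbs renames and re-arranged directory trees at no extra cost, but a file whose metadata changed hashes differently, so it would still need a delta scheme on top.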
Fred