On Tue, Jun 29, 2004 at 21:31:18 +0530, Siddhartha Jain wrote:
> I guess at this point I should discuss what I intend to do and try to weed
> out any design flaws.
>
> Basically, I intend to create a module that, when loaded, allows for
> file/directory replication. So if /home is meant to be replicated to /opt
> then /etc can contain a config file that has a line like:
> /home /opt
>
> I chose to place my hooks on the VFS layer so that the implementation is FS
> independant.

I believe that in Linux 2.6, there is a concept of a stackable
filesystem. It should be somehow possible to mount a filesystem over
another and the filesystem mounted on top should be able to access the
filesystem below, provided it's designed to do that.

In other words, I think you should NOT do it in VFS, but do it as
a special filesystem. This filesystem would implement all the methods of
dentry, inode and file and in these methods, redispatch to another FS
driver (by doing manual path_walk and calling respective
inode/dentry/file objects).

It's also possible to use the coda interface to kernel and implement
your filesystem in userland. There are libraries -- fuse and lufs --
that will help you with that.

After all, when I think of it, coda itself is quite probably a good
solution for you, without writing a single line of code.

> The flow would be - If a sys_open is issued with any sort of write mode then
> the sys_open function should check if the file is meant to be replicated. If
> yes, then it should also open the replica. The file descriptor of the source
> and destination files should be stored in a global data structure.
>
> Now when, the file is written to by sys_write (or sys_writev etc), the
> function needs to check if the file descriptor passed to it is listed in the
> data structure created by sys_open above. If yes, it writes the data written
> to source to replica also.
>
> sys_close would check if the descriptor passed to it is listed in the
> replication data structure and close the replica fd as well.
>
> A global flag should be set by the module to indicate whether sys_open,
> sys_write and sys_close should deviate and check for replica file/fd.
>
> Unloading the module should unset the global flag for replication and clear
> out of the file descriptor data structure.
>
> All functions doing the various jobs of file path comparison, data structure
> updation etc would go in the module so that minimal changes are done to the
> original functions. I guess there will be still be some concurrency issues
> to deal with?
>
> Does this sound like a proper and feasible design?

No. That's error-prone and dirty.

Actually, first have a look at the four replicating network filesystems
available for linux. It's quite likely that what you need can work
sufficiently well with one of them. These are:

* coda: This one is distributed with the kernel (well, since the server
  as well as most of the client is in userland, you will need those, of
  course). It is a replicating filesystem, where the local replica
  behaves like a cache. It should also let you say explicitly that
  a file must be cached, so you can disconnect the computer, continue to
  work disconnected and then sync the changes on reconnect.

  It needs to have the whole file in the local replica before opening
  it, but that is not necessarily a problem. The server needs
  a dedicated partition that can only be accessed through it, though.

  It should be quite well tested and stable.

* intermezzo: This is similar to coda in features, but lighter in
  implementation (it actually uses a normal http server for serving file
  content). It is younger and thus a bit less tested. It is also
  supported on fewer other OSes.

* afs: This filesystem predates, and inspired, coda. It has the client
  fully in kernel. It does not support disconnected operation.
  On the other hand, since it caches blocks and not whole files, it
  works better in real time. Its drawback is that the driver is not
  that stable. It's not that bad, however. We have it in a computer lab
  and it works just fine (NFS used to be much worse), even across
  different systems (linux, solaris, irix and windows). It is not
  shipped with the kernel.

* lustre: This is a brand new thing. It's a distributed filesystem
  designed for large clusters. It's designed to scale to tens of
  thousands of nodes. Quite probably overkill, but who knows.

-------------------------------------------------------------------------------
Jan 'Bulb' Hudec <bulb@ucw.cz>
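
As a rough illustration of the redispatch idea above (put the replication in a layer that forwards each operation, rather than patching sys_open and sys_write), here is a minimal userland sketch in Python. The names replica_path and MirroredFile are made up for this example. It ignores concurrency, partial writes and error recovery, and it is not how coda or any of the filesystems listed above work internally.

```python
import os


def replica_path(path, src_root, dst_root):
    """Map a path under src_root to the matching path under dst_root.

    Returns None when the path does not live under src_root.
    """
    path = os.path.abspath(path)
    src_root = os.path.abspath(src_root)
    # commonpath compares whole components, so /homex is not "under" /home.
    if os.path.commonpath([path, src_root]) != src_root:
        return None
    rel = os.path.relpath(path, src_root)
    return os.path.normpath(os.path.join(os.path.abspath(dst_root), rel))


class MirroredFile:
    """Hypothetical file-like wrapper that forwards each write to a replica."""

    def __init__(self, path, src_root, dst_root):
        self._primary = open(path, "wb")
        self._replica = None
        target = replica_path(path, src_root, dst_root)
        if target is not None:
            # Create the replica's parent directories on demand.
            os.makedirs(os.path.dirname(target), exist_ok=True)
            self._replica = open(target, "wb")

    def write(self, data):
        # Write to the primary first, then redispatch to the replica.
        written = self._primary.write(data)
        if self._replica is not None:
            self._replica.write(data)
        return written

    def close(self):
        self._primary.close()
        if self._replica is not None:
            self._replica.close()
```

With a "/home /opt" mapping like the one in the quoted config line, replica_path("/home/u/x", "/home", "/opt") gives "/opt/u/x". Keeping the replica handle inside the wrapper object avoids the global file descriptor table from the original proposal.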