Re: Tracking File Creations, Modifications, and Deletions

On 21/07/2009 00:28, Drew Morris wrote:
Hi All...
We are developing a custom translator to log modifications to files
(including creation, update and deletion)

mtime attribute?

into a database.

Have you looked into SeznamFS?

*Our Current Approach:*
By reviewing the Gluster and FUSE source code and documentation, we
concluded that the following FOPs should be monitored for this purpose:
open, create, mknod, truncate, ftruncate, writev, flush, release, unlink and
rename.

You should really look into SeznamFS.

We would like to insert one record per file modification, hence we
need a mechanism to aggregate multiple operations such as open, writev
and flush over one file descriptor into a single update.

For performance's sake and to prevent dirty reads, we would like to do
the database row insertion in the callback of the very last action that is
performed. In other words, during a write we just set a modified flag
in the file descriptor context and perform the insert in the very last action.
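
As a rough illustration of the flag-and-defer idea described above, here is a minimal sketch in Python rather than translator C. The names `FdContext`, `FdTracker` and the `on_*` handlers are made up for this example and are not GlusterFS or FUSE APIs, and the sketch simply assumes that release arrives, which is exactly the part that turns out to be unreliable.

```python
from dataclasses import dataclass, field


@dataclass
class FdContext:
    """Per-descriptor state; stands in for the translator's fd context."""
    path: str
    modified: bool = False


@dataclass
class FdTracker:
    """Collapses the operations seen on one descriptor into a single record."""
    contexts: dict = field(default_factory=dict)  # fd -> FdContext
    records: list = field(default_factory=list)   # stands in for the database

    def on_open(self, fd, path):
        self.contexts[fd] = FdContext(path)

    def on_writev(self, fd):
        # Only mark the descriptor; no database work on the write path.
        self.contexts[fd].modified = True

    def on_flush(self, fd):
        # Deliberately a no-op here: flush may be seen more than once per fd.
        pass

    def on_release(self, fd):
        # Treated as the "last action" in this sketch (an assumption).
        ctx = self.contexts.pop(fd, None)
        if ctx is not None and ctx.modified:
            self.records.append(("update", ctx.path))


tracker = FdTracker()
tracker.on_open(3, "/data/report.txt")
tracker.on_writev(3)
tracker.on_writev(3)
tracker.on_flush(3)
tracker.on_flush(3)
tracker.on_release(3)
print(tracker.records)  # [('update', '/data/report.txt')]
```

Two writes and two flushes collapse into one record, but only because the release handler runs; if it never does, the context stays in `contexts` and nothing is logged.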

The major issue is that (as most of the docs and FAQ indicated) there
is no reliable mechanism to decide which FOP action is the last one.

If I'm following what you are saying, that's not sensibly doable, because you never know whether there will be another operation. You have to treat each op as the last one, because you don't know what happens next. So you'll have to log all of them, and if you only ever want one of them, key them by file path hash in your DB so that each op overwrites the previous log. But if you're doing that, you might as well just do a recursive scan for mtime to see what's changed and take it from there.
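
To make the "key them by file path hash" suggestion concrete, here is a minimal sketch using Python's sqlite3 module with an in-memory database. The table name `file_log`, its columns and the `log_op` helper are assumptions made for the example, as is the choice of SHA-1 for the path hash; any real schema and database would differ.

```python
import hashlib
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_log ("
    " path_hash TEXT PRIMARY KEY,"
    " path TEXT NOT NULL,"
    " last_op TEXT NOT NULL,"
    " logged_at REAL NOT NULL)"
)


def log_op(path, op):
    """Record op for path, replacing whatever was logged for it before."""
    path_hash = hashlib.sha1(path.encode("utf-8")).hexdigest()
    # The primary key on path_hash makes every op overwrite the previous row.
    conn.execute(
        "INSERT OR REPLACE INTO file_log VALUES (?, ?, ?, ?)",
        (path_hash, path, op, time.time()),
    )
    conn.commit()


# Every op is treated as if it were the last one for that file.
for op in ("create", "writev", "writev", "flush"):
    log_op("/data/report.txt", op)
log_op("/data/other.txt", "unlink")

rows = conn.execute(
    "SELECT path, last_op FROM file_log ORDER BY path"
).fetchall()
print(rows)  # [('/data/other.txt', 'unlink'), ('/data/report.txt', 'flush')]
```

Each file ends up with exactly one row holding the most recent op, at the cost of one database write per op.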

We monitored file system interaction via trace module and noticed
that the flush action is called several times and release is never invoked
in many cases.

Bug?

This issue forced us to log the very first flush, which is quite problematic
for a number of reasons: we can never be sure the operation is finished
before triggering any of our asynchronous operations, and we are slowing
down the initial write because we are waiting on the log action to complete.

Have you tried it using a dummy FS, rather than piggybacking on GlusterFS? If so, did you observe the same flush/release behaviour?

*Question:*
Does anyone have a better solution for this issue? Perhaps there should
be a mechanism to notify us of the closing of a file, otherwise an open
file descriptor will remain forever.
We would really love to find any other reliable method that allows us to
track these operations at a higher level.

We would greatly appreciate any new approach that can overcome these
deficiencies.

Other than SeznamFS which I mentioned above, perhaps CopyFS might give you a better base to work on? The sort of thing you are describing doesn't strike me as a major use-case for GlusterFS.

Gordan



