On Wed, Nov 14, 2007 at 02:22:51PM -0500, Jon Smirl wrote:
> On 11/14/07, J. Bruce Fields <bfields@xxxxxxxxxxxx> wrote:
> > On Wed, Nov 14, 2007 at 04:30:16PM +0100, Andi Kleen wrote:
> > > "Jon Smirl" <jonsmirl@xxxxxxxxx> writes:
> > >
> > > > On 11/14/07, Chuck Lever <chuck.lever@xxxxxxxxxx> wrote:
> > > >> On Nov 13, 2007, at 7:04 PM, Jon Smirl wrote:
> > > >> > Is it feasible to do something like this in the linux file system
> > > >> > architecture?
> > > >> >
> > > >> > Beagle beats on my disk for an hour when I reboot. Of course I don't
> > > >> > like that and I shut Beagle off.
> > > >>
> > > >> Leopard, by the way, does exactly this: it has a daemon that starts
> > > >> at boot time and taps FSEvents then journals file system changes to a
> > > >> well-known file on local disk.
> > > >
> > > > Logging file systems have all of the needed info.
> > >
> > > Actually most journaling file systems in Linux use block logging and
> > > it would be probably hard to get specific file names out of a random
> > > collection of logged blocks. And even if you could they would
> > > hit a lot of false positives since everything is rounded up
> > > to block level.
> > >
> > > With intent logging like in XFS/JFS it would be easier, but even
> > > then costly :- e.g. they might log changes to the inode but
> > > there is no back pointer to the file name short of searching the
> > > whole directory tree.
> >
> > So it seems the best approach given the current api's would be just to
> > cache all the stat data, and stat every file on reboot.
> >
> > I don't understand why beagle is reading the entire filesystem data. I
> > understand why even just doing the stat's could be prohibitive, though.
>
> I believe Beagle is looking at the mtimes on the files. It uses xattrs
> to store the last mtime it checked and then compares it to the current
> mtime. It also stores a hash of the file in an xattr. So even if the

You meant "only if", not "even if"?

> mtimes don't match it recomputes the hash and only if the hashes
> differ do it update its free text search index.

OK, that makes a little more sense.

(Though it seems unfortunate to use xattrs instead of caching the data
elsewhere. Git and NFS, for example, both use the ctime to decide when a
file has changed, and setting an xattr updates the ctime, so you're
invalidating their caches unnecessarily.)

--b.
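
The mtime-then-hash check described above can be sketched with the state kept in a side cache rather than in xattrs, so the indexer never touches the ctime of the files it scans. This is only an illustration under stated assumptions, not Beagle's actual code: the names `file_digest` and `needs_reindex`, the in-memory dict used as the cache, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 hex digest of the file's contents (hash choice is an assumption)."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_reindex(path, cache):
    """Return True if the content of `path` changed since it was last checked.

    `cache` is a hypothetical side store mapping path -> (mtime_ns, digest);
    it is updated in place and stands in for the xattrs discussed above.
    """
    mtime = os.stat(path).st_mtime_ns
    entry = cache.get(path)
    if entry is not None and entry[0] == mtime:
        # mtime unchanged: assume the content is unchanged and skip hashing.
        return False
    # mtime differs (or the file is new): only now pay for reading the file.
    digest = file_digest(path)
    changed = entry is None or entry[1] != digest
    cache[path] = (mtime, digest)
    return changed
```

A scan after reboot would call `needs_reindex` once per file: the common case costs one `stat`, and the file is read only when its mtime has moved. A real indexer would persist the cache on disk between runs.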