On Thu, Apr 14, 2011 at 12:49 PM, Jan Kara <jack@xxxxxxx> wrote:
> On Thu 14-04-11 12:36:40, Amir Goldstein wrote:
>> On Thu, Apr 14, 2011 at 12:21 PM, Jan Kara <jack@xxxxxxx> wrote:
>> > On Thu 14-04-11 10:12:26, Amir Goldstein wrote:
>> >> On Thu, Apr 14, 2011 at 12:39 AM, Jan Kara <jack@xxxxxxx> wrote:
>> >> > On Wed 13-04-11 21:16:40, Amir Goldstein wrote:
>> >> >> On Tue, Apr 12, 2011 at 5:48 PM, Jan Kara <jack@xxxxxxx> wrote:
>> >> >> > modification stamps have possibly larger race windows but I haven't really
>> >> >> > tried how much (I just know that even mtime races are not that hard to
>> >> >> > trigger if you try). So it really depends on how big reliability do you
>> >> >> > expect and I personally don't find much value in just rescanning and
>> >> >> > checking for mtime after a crash. Reading all the data and doing checksum
>> >> >> > certainly has more value but at a high cost.
>> >> >> >
>> >> >>
>> >> >> What do you think about the approach to store recursively modified dir inodes in
>> >> >> a journal "modified inode descriptor block" and update the recursive mtime of
>> >> >> those dirs on journal recovery?
>> >> > The trouble is you don't know the number of directories that may need
>> >> > to have timestamp updated - you find that out only as you travel upwards.
>> >> > So it's hard to reserve any fixed space for this.
>> >> >
>> >>
>> >> True, but you can save *so* many inode numbers in just one descriptor
>> >> block and in case of an overflow, we can just pass a hint to the top
>> >> level application to do a full directory scan, so I hardly see that as a
>> >> big problem.
>> > Well, about 1000 but you can still have about 8000 inodes modified in a
>> > transaction for a standard 128 MB journal. You can notify the userspace
>> > when an overflow happens but the interface gets kind of ugly... Also it
>> > would be only specific to ext3/4 while I'd prefer to get a wider fs
>> > support.
>>
>> Well, the persistent inode notification (by the way a feature provided by NTFS),
>> can be specific to ext4, but it can work together with a generic recursive mtime
>> code.
>> ext4 will simply touch directories during journal recovery.
>> other fs will only have the generic runtime recursive mtime.
> But then applications cannot rely on the behavior and cannot take much
> benefit from it. Well, they could still ignore the risks and ext4 would be
> nicer to them. But I'm not really sure what are you aiming at...

I am just aiming at making mtime (or recursive mtime) more reliable.
It may already be more reliable in other fs, so every fs can try to make
it more reliable internally.

>
>> >> >> I would also consider to use a mount option rec_mtime and then just
>> >> >> store recursive mtime in the directory's inode mtime instead of an
>> >> >> extended attribute.
>> >> >> That doesn't break any contract with user space, it's just a re-interpretation
>> >> >> of the dir modification notion.
>> >> > It breaks POSIX specification - POSIX pretty much specifies when mtime is
>> >> > supposed to be changed - so I'm not sure we really want to do that...
>> >>
>> >> I disagree, POSIX doesn't forbid a user space daemon from touching directory
>> >> inodes and updating their mtime. The rec_mtime feature should be treated as
>> >> a little kernel "daemon" which propagates information to user space by touching
>> >> recursively modified directories.
>> > OK, if you look at it this way it makes some sense. You lose the
>> > distinction whether something has been created / deleted in the directory
>> > or whether only something happened in its subdirectory or file but that
>> > does not seem too important for any use case I can think of.
>>
>> Personally, whenever I look at a dir mtime I would much rather see
>> recursive mtime (I would much rather see recursive size as well but that
>> is too much to ask).
>> rsync can be easily modified to skip entire
>> directories if their (recursive) mtime hasn't changed. I would like to
>> view dir (recursive) mtime using existing tools (from ls to folder
>> manager) and not use specialized tools that look at extended attributes,
>> but hey, that's just me :-)
> That would be neat but note that even my patches don't provide complete
> recursive mtime behavior. They just update the time stamp once and then
> stop updating it until you ask about the update again. This makes the whole
> framework really efficient for often modified directories but less useful
> for cases like "I want to see time when something has changed in this
> subtree".
>
> But still I kind of like your idea of hijacking directory mtime/ctime for
> these purposes because it would make several things simpler.

I have actually written a design that hijacks the dir atime for recursive mtime,
before you told me about your patches (my design does not have the one-time
modification trick).
The advantage of hijacking atime is that few (if any) applications rely on it,
since it is not reliable with noatime.
But the greater advantage of hijacking mtime is that it is persistent in
file systems where atime isn't (e.g. vfat).

>
> Honza
> --
> Jan Kara <jack@xxxxxxx>
> SUSE Labs, CR
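
To make the "daemon that touches directories" picture concrete, here is a
rough user space sketch in Python. It is only an illustration of the idea
discussed above, not the rec_mtime mount option and not Jan's patches:
touch_ancestors(), changed_files() and demo() are hypothetical names made up
for this example, it does nothing about crashes or journal recovery, and it
always propagates instead of using the update-once trick. It also shows how
an rsync-like scanner could skip whole subtrees whose (recursive) directory
mtime has not moved.

```python
import os
import tempfile


def touch_ancestors(path, root, stamp):
    """Raise the mtime of every directory from path's parent up to root.

    Hypothetical helper: stands in for the kernel "daemon" that touches
    recursively modified directories.
    """
    root = os.path.realpath(root)
    cur = os.path.dirname(os.path.realpath(path))
    if os.path.commonpath([root, cur]) != root:
        raise ValueError("path is not below root")
    while True:
        st = os.stat(cur)
        if st.st_mtime < stamp:
            # Keep atime as it was, only move mtime forward.
            os.utime(cur, (st.st_atime, stamp))
        if cur == root:
            break
        cur = os.path.dirname(cur)


def changed_files(root, since):
    """Return files newer than `since`, skipping unchanged subtrees.

    A directory whose mtime is <= since is assumed to contain nothing new,
    which only holds if its mtime is maintained recursively.
    """
    found = []
    stack = [root]
    while stack:
        d = stack.pop()
        if os.stat(d).st_mtime <= since:
            continue  # whole subtree skipped, as a modified rsync could do
        with os.scandir(d) as entries:
            for e in entries:
                if e.is_dir(follow_symlinks=False):
                    stack.append(e.path)
                elif e.stat(follow_symlinks=False).st_mtime > since:
                    found.append(e.path)
    return sorted(found)


def demo():
    """Build a small tree, modify one file, and scan before and after."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "a", "b"))
        os.makedirs(os.path.join(root, "c"))
        f = os.path.join(root, "a", "b", "f.txt")
        g = os.path.join(root, "c", "g.txt")
        for p in (f, g):
            with open(p, "w") as fh:
                fh.write("x")
        # Pin every mtime to a known old value so the result is deterministic.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                os.utime(os.path.join(dirpath, name), (1000, 1000))
            os.utime(dirpath, (1000, 1000))
        before = changed_files(root, since=2000)
        os.utime(f, (3000, 3000))       # simulate a later write to f
        touch_ancestors(f, root, 3000)  # propagate the change upwards
        after = changed_files(root, since=2000)
        return before, [os.path.relpath(p, root) for p in after]
```

In demo() the scan after the modification descends only into a/ and a/b/;
c/ keeps its old mtime and is skipped without being read. As Jan notes
above, the price is that a directory's mtime no longer tells you whether an
entry was created or deleted in it or whether something merely changed
further down.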