On Wed, Feb 21, 2018 at 10:33:05PM +0100, Ævar Arnfjörð Bjarmason wrote:
> This sounds like a sensible job for a git import tool, i.e. import a
> target directory into git, and instead of 'git add'-ing the whole thing
> it would look at the mtimes, sort files by mtime, then add them in order
> and only commit those files that had the same mtime in the same commit
> (or within some boundary).

I think that this would be The Wrong Thing to do.

The commit time is just that: The time the commit was done. The commit
is an atomic group of changes to a number of files that hopefully bring
the tree from one usable state into the next.

The mtime, in contrast, tells us when a file was most recently modified.

It may well be that main.c was most recently modified yesterday, and
feature.c was modified this morning, and that only both changes taken
together make sense as a commit, despite the long time in between.

Even worse, it may be that feature A took a long time to implement, so
we have huge gaps in between the mtimes, but feature B was quickly done
after A was finished. Such an algorithm would probably split feature A
incorrectly into several commits, and group the more recently changed
files of feature A with those of feature B.

And if Feature A and Feature B were developed in parallel, things get
completely messy.

> The advantage of doing this via such a tool is that you could tweak it
> to commit by any criteria you wanted, e.g. not mtime but ctime or even
> atime.

Maybe, but it would be rather useless to commit by ctime or atime. You
do one grep -r and the atime is different. You do one chmod or chown and
the ctime is different. Those timestamps are really only useful for very
limited purposes.

That ctime exists seems reasonable, since it's only ever updated when
the inode is written anyway. atime, in contrast, was clearly one of the
rather nonsensical innovations of UNIX: Do one write to the disk for
each read from the disk. C'mon, really?
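For anyone who wants to check the ctime claim on their own machine, here is a minimal Python sketch. It assumes a POSIX system where st_ctime means "inode change time" (on Windows it means something else), and the helper name timestamps_after_chmod is made up for this illustration. It only looks at mtime and ctime; atime is left out on purpose, since its behaviour depends on mount options such as relatime or noatime.

```python
import os
import tempfile
import time


def timestamps_after_chmod(pause=0.05):
    """Return (before, after) os.stat() results around a chmod on a scratch file.

    Hypothetical helper for illustration; assumes POSIX ctime semantics.
    """
    fd, path = tempfile.mkstemp()  # created with mode 0o600
    try:
        os.write(fd, b"hello\n")
        os.close(fd)
        before = os.stat(path)
        # Wait a little so the clock moves past the timestamp granularity.
        time.sleep(pause)
        # Metadata-only change: the file contents are not touched.
        os.chmod(path, 0o644)
        after = os.stat(path)
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    before, after = timestamps_after_chmod()
    print("mtime changed:", before.st_mtime_ns != after.st_mtime_ns)
    print("ctime changed:", before.st_ctime_ns != after.st_ctime_ns)
```

On a typical Linux filesystem this should report that the mtime stayed the same while the ctime moved, which is why a tool grouping commits by ctime would be thrown off by a simple permission change.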
It would have been a lot more reasonable to simply provide a generic way
for tracing read() system calls instead; then userspace could decide
what to do with that information and which of it is useful and should be
kept and perhaps stored on disk. Now we have this ugly hack called
relatime to deal with the problem.

> You'd get the same thing as you'd get if git's tree format would change
> to include mtimes (which isn't going to happen), but with a lot more
> flexibility.

Well, from basic logic, I don't see how a decision not to implement a
feature could possibly increase flexibility. The opposite seems to be
the case.

Best wishes
Peter

--
Peter Backes, rtc@xxxxxxxxxxxxxxxxxxx