Re: Fastest way to set files date and time to latest commit time of each one

Ivan Baldo <ibaldo@xxxxxxxxx> wrote:
>   Hello.
>   I know this is not standard usage of git, but I need a way to have
> more stable dates and times in the files in order to avoid rsync
> checksumming.
>   So I found this
> https://stackoverflow.com/questions/2179722/checking-out-old-file-with-original-create-modified-timestamps/2179876#2179876
> and modified it a bit to run in CentOS 7:
> 
> IFS="
> "
> for FILE in $(git ls-files -z | tr '\0' '\n')
> do
>     TIME=$(git log --pretty=format:%cd -n 1 --date=iso -- "$FILE")
>     touch -c -m -d "$TIME" "$FILE"
> done
> 
>   Unfortunately it takes ages for a 84k files repo.
>   I see the CPU usage is dominated by the git log command.

Running `git log' once per file isn't necessary.

On Debian, rsync actually ships the `git-set-file-times' script
in /usr/share/doc/rsync/scripts/, which runs `git log' only once
and parses its output.

You can also get my (original) version from:
https://yhbt.net/git-set-file-times
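For reference, the one-pass idea can be sketched in plain shell.
This is my own illustration, not the shipped script: it assumes
GNU coreutils (`touch -d @EPOCH', `stat -c') and filenames free of
spaces, newlines, or characters git would quote, and it builds a
throwaway repo only so the example is self-contained:

```shell
#!/bin/sh
set -e

# Demo setup: a throwaway repo with one file at a known commit time,
# so the sketch below can be run and checked anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo hello > a.txt
git add a.txt
GIT_AUTHOR_DATE='1000000000 +0000' GIT_COMMITTER_DATE='1000000000 +0000' \
    git commit -q -m init

# The technique: one `git log' walks history newest-first; the first
# time a path appears, that commit's time is its latest commit time.
git log --pretty=format:'T %ct' --name-only --no-renames |
awk '/^T / { t = $2; next }                      # commit timestamp line
     NF && !seen[$0]++ { print t "\t" $0 }' |    # first (newest) hit per path
while IFS="$(printf '\t')" read -r t f; do
    if [ -e "$f" ]; then
        touch -c -m -d "@$t" "$f"
    fi
done

echo "a.txt mtime: $(stat -c %Y a.txt)"   # → a.txt mtime: 1000000000
```

A real run would drop the demo-repo setup and start the pipeline from
the top of an existing work tree.  Note it still spawns one `touch'
per file, which is exactly the process-spawning overhead a Perl or C
implementation calling utime() directly avoids.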

>   I know a way I could use to split the work for all the CPU threads
> but anyway, I would like to know if you guys and girls know of a
> faster way to do this.

Much of your overhead is going to be from process spawning.
My Perl version reduces that significantly.

I haven't tried it with 84K files, but it will have to keep all
those filenames in memory.  I'm also not sure parallelizing the
utime() syscalls is worth it; it may help more on SSDs than on
HDDs.


