Re: Committing crimes with NTFS-3G

On Thu, Aug 29, 2024 at 11:43:40PM +0300, Roman Sandu wrote:
> Good day!
> 
> I have a decently sized (80K files) monorepo on an NTFS drive that I have
> been working with for a while under Windows via git-for-windows. Recently, I
> had to (temporarily) switch to Ubuntu (24.04) via dual-boot for irrelevant
> reasons, and I decided that simply mounting my NTFS drive and using the
> monorepo from Ubuntu is a great idea, actually, as NTFS-3G allows for
> seamless interop with NTFS via UserMapping. And so that is exactly what I
> did and It Just Works!
> 
> Except it kind of does not. Every time I run `git status` it takes 8
> seconds, which is very painful when doing tricky history rewriting.
> 
> To diagnose the problem, I ran git status with GIT_TRACE_PERFORMANCE
> enabled, and what I see is that the "refresh index" region is taking up 99%
> of the time. Digging further, `strace -fc git status` tells me that 99% of
> the time is spent on newfstatat'ing files. Okay, makes sense, stat'ing files
> through FUSE is not all that quick. But how many files are we talking about?
> My repository has `feature.manyFiles` enabled in git, so I would expect
> `core.untrackedCache` to make it so that `git status` skips basically
> everything except for the root folder which contains, what, 20 subfolders?
> But it actually does >96K stat calls! Which is more than the number of files
> in the repository in total. Briefly looking at the output of `strace -f git
> status`, I see that git indeed goes through basically all of the repository,
> even things that have not changed for years, as if `core.untrackedCache` is
> not actually enabled. Manually enabling it on top of `feature.manyFiles`
> does not help. Note that `git update-index --test-untracked-cache` tells me
> that mtime does indeed work, and I've also manually stat'ed some folders
> which `git status` re-stats on every run and I see that the modify time is
> indeed a couple of hours ago, yet even when running `git status` several
> times in a row it re-scans the entire folder every time.
> 
> So, what do I do about this? It honestly looks like a git bug to me, maybe
> it silently fails to update the index with new timestamps because it was
> initially created on Windows? But I have no clue how to narrow this issue
> down further, so any ideas or suggestions would be appreciated!
> 
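
For comparing drivers, the measurements described above can be collected in one place. This is only a minimal sketch, assuming it is run from the repository root and that strace is installed; the function name `measure_status` and the output file name are made up for illustration.

```shell
# Hypothetical helper: summarize how much stat work `git status` does.
measure_status() {
    # Number of tracked files, for comparison with the stat count.
    echo "tracked files: $(git ls-files | wc -l)"

    # Per-region timings from git itself (the trace goes to stderr).
    GIT_TRACE_PERFORMANCE=1 git status >/dev/null

    # Syscall summary across child processes, written to a file;
    # the newfstatat row is the one of interest here.
    strace -fc -o strace-summary.txt git status >/dev/null
    grep -E 'newfstatat|total' strace-summary.txt
}
```

Running `measure_status` before and after switching drivers should show whether the number of stat calls changed or only their cost.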

It was pretty big news that Paragon's read-write NTFS driver (ntfs3) was
merged into the kernel in 5.15.  You might want to give that a try if your
main problem is performance.

https://lore.kernel.org/lkml/CAHk-=wjn4W-7ZbHrw08cWy=12DgheFUKLO5YLgG6in5TA5HxqQ@xxxxxxxxxxxxxx/
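
A rough sketch of what trying it might look like, assuming the distribution kernel ships the ntfs3 module and is allowed to load it. The function name `remount_ntfs3`, the device, and the mount point are placeholders, and the `uid`/`gid` options are only one assumed way to get usable file ownership; this is not a statement about how the NTFS-3G UserMapping setup carries over.

```shell
# Hypothetical helper: remount an NTFS partition with the in-kernel ntfs3 driver.
# Usage: remount_ntfs3 /dev/sdXN /mnt/point   (both arguments are placeholders)
remount_ntfs3() {
    dev=$1
    mnt=$2

    # ntfs3 is often built as a module; load it if it is not registered yet.
    grep -qw ntfs3 /proc/filesystems || sudo modprobe ntfs3 || return 1

    # Drop the existing mount (e.g. the NTFS-3G FUSE one), if there is one.
    if mountpoint -q "$mnt"; then
        sudo umount "$mnt" || return 1
    fi

    # uid/gid make the files appear owned by the current user.
    sudo mount -t ntfs3 -o "uid=$(id -u),gid=$(id -g)" "$dev" "$mnt"
}
```

After remounting, timing `git status` again should show whether the stat overhead was mostly down to FUSE.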

Regards,
Vito Caputo



