Hi,

On Sun, 4 Aug 2019, Pratyush Yadav wrote:

> On 8/4/19 2:04 AM, Johannes Schindelin wrote:
> >
> > On Sat, 3 Aug 2019, Pratyush Yadav wrote:
> >
> > > On 8/2/19 6:09 PM, Johannes Schindelin wrote:
> > > >
> > > > On Fri, 2 Aug 2019, Pratyush Yadav wrote:
> > > >
> > > > > On 8/1/19 1:12 AM, Johannes Schindelin wrote:
> > > > > >
> > > > > > I would be _extremely_ cautious to base an argument on one
> > > > > > particular setup, using one particular piece of hardware with
> > > > > > one particular OS and one particular repository.
> > > > >
> > > > > Agreed. That's why I asked for benchmarks from other people.
> > > > > Unfortunately, no one replied.
> > > >
> > > > What stops _you_ from performing more tests yourself? There are
> > > > tons of real-world repositories out there, and we even talk about
> > > > options for large repositories to test with in Git for Windows'
> > > > Contributing Guidelines:
> > > > https://github.com/git-for-windows/git/blob/master/CONTRIBUTING.md#performance-tests
> > >
> > > I thought the point was to not base all data off a single setup?
> >
> > You misunderstood what I was saying: a single setup is bad, and you
> > can make it moderately better by testing _at least_ with a
> > moderately-sized repository [*1*] in addition to git.git.
> >
> > So yes, it would still not be enough to test with, say, git.git
> > _and_ the Chromium repository _only_ on your setup, but if not even
> > you can be bothered to test with more than one small repository, how
> > can you possibly expect anybody else to add more testing?
>
> All right, I'll see what repos I can test.
>
> But my internet is pretty slow and unstable, so my clone of the
> Chromium repo failed mid-way multiple times. I assume we need to test
> on a large index, so is it all right if I use
> t/perf/repos/many-files.sh to artificially generate a large repo?

Why do you ask me for permission to just try this?
I feel very uncomfortable being put in such a position: I am not your
manager or gate-keeper or anything.

> > > [...]
> > > > I wonder, however, whether you can think of a better method to
> > > > figure out when to auto-refresh. Focus seems to be a low-hanging
> > > > fruit, but as you noticed, it is not very accurate. Maybe if you
> > > > combine it with a timeout? Or maybe you can detect idle time in
> > > > Tcl/Tk?
> > >
> > > Hm, I don't see a better alternative than file system watches.
> > > Timeouts are a heuristic that can potentially be problematic.
> >
> > Let me stress the fact that I suggested a timeout _in addition_ to
> > the focus event.
>
> Oh, my bad. I thought you suggested using timeouts exclusively.
>
> But I'm not sure I understand what you mean by "using timeouts in
> addition to the focus event". My guess is that you mean we should
> activate a refresh-on-focus-in only after git-gui has been out of
> focus for a certain amount of time. Is my guess correct?

I am _not_ telling you what strategy you should use. You really need to
come up with hypotheses about what tell-tale signs of committable
outside changes could be easy to detect. This is your patch, and your
project.

My suggestion about a time-out was to think a bit further than just the
mere Tk-provided events to detect whether the user might have changed
anything outside of Git GUI that would make an automatic refresh
convenient for the user.

I do _not_ want to engage in this project; it is not my pet project.

> > Yes, using a timeout on its own is stupidly problematic. That's why
> > I did not suggest that.
> >
> > > If you do a refresh too frequently, you hog the user's resources
> > > for little benefit.
> >
> > Indeed. You want to find a heuristic that catches most of the cases
> > where files were changed, while at the same time not even _trying_
> > to refresh automatically when probably nothing changed.
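For what it is worth, the "refresh-on-focus-in only after being out of
focus for a while" guess above can be sketched independently of any GUI
toolkit. This is only an illustration of the debounce logic, not a
recommendation; git-gui itself is written in Tcl/Tk, and the class and
parameter names below are made up:

```python
import time

class FocusRefreshHeuristic:
    """Hypothetical sketch: trigger a refresh on focus-in only if the
    window was out of focus for at least `min_away` seconds, so that
    quickly alt-tabbing through windows does not cause refreshes."""

    def __init__(self, min_away=5.0, now=time.monotonic):
        self.min_away = min_away
        self.now = now          # injectable clock, eases testing
        self.lost_at = None     # when focus was last lost, or None

    def on_focus_out(self):
        # Record the moment the window lost focus.
        self.lost_at = self.now()

    def on_focus_in(self):
        # Return True if enough time passed out of focus to make an
        # automatic refresh plausibly worthwhile.
        if self.lost_at is None:
            return False
        away = self.now() - self.lost_at
        self.lost_at = None
        return away >= self.min_away
```

Whether a wall-clock threshold is a good proxy for "the user probably
changed something outside Git GUI" is exactly the kind of hypothesis
that would need testing.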
> Like I said before, the best way of doing that that I can see is file
> system watches.

That's not a heuristic. A file system monitor does a lot of work in this
case, for dubious benefit.

Take git.git, for example: Let's say that I run a specific test. It
creates a directory under `t/`: the filesystem monitor triggers. It
creates a repository in that directory: the filesystem monitor triggers
_multiple times_. The test then performs all kinds of I/O, eventually
removing the directory when all tests passed. Note that none of these
filesystem changes corresponds to anything that would update _anything_
in Git GUI during a refresh.

Of course, this is something I did not mention before because I took it
for granted that you would always try to weigh the benefits of your
approach against the worst possible unintended consequences.

> But maybe we can get reasonable performance with a combination of
> timeouts and focus events.

Please note that I would not be surprised if this heuristic _also_
resulted in a lot of bad, unintended consequences. That's for you to
find out.

> > Footnote *1*: I don't expect you to test with the largest
> > repositories, as you are unlikely to have access to anything
> > approaching the size of the largest Git repository on the planet:
> > https://devblogs.microsoft.com/bharry/the-largest-git-repo-on-the-planet/
>
> Ah yes, I read about it a while back on Reddit. Having a huge
> monolithic repo sounds backwards to me. Using submodules sounds like a
> better idea, but who am I to judge. They probably have their reasons
> that I'm not aware of.

This statement just sounds to me as if you have never used submodules in
any serious way. My experience is that software developers who have
tried to use submodules offer opinions that read very differently from
that paragraph. Strong opinions usually do not survive contact with
open-minded exposure to reality.

Ciao,
Johannes