Re: [PATCH v4 04/15] scalar: 'register' sets recommended config and starts maintenance

Hi Elijah,

On Mon, 27 Sep 2021, Elijah Newren wrote:

> On Tue, Sep 14, 2021 at 7:39 AM Derrick Stolee via GitGitGadget
> <gitgitgadget@xxxxxxxxx> wrote:
> ...
> > +static int set_recommended_config(void)
> > +{
> > +       struct {
> > +               const char *key;
> > +               const char *value;
> > +       } config[] = {
> > +               { "am.keepCR", "true" },
> > +               { "core.FSCache", "true" },
> > +               { "core.multiPackIndex", "true" },
> > +               { "core.preloadIndex", "true" },
> > +#ifndef WIN32
> > +               { "core.untrackedCache", "true" },
> > +#else
> > +               /*
> > +                * Unfortunately, Scalar's Functional Tests demonstrated
> > +                * that the untracked cache feature is unreliable on Windows
> > +                * (which is a bummer because that platform would benefit the
> > +                * most from it). For some reason, freshly created files seem
> > +                * not to update the directory's `lastModified` time
> > +                * immediately, but the untracked cache would need to rely on
> > +                * that.
> > +                *
> > +                * Therefore, with a sad heart, we disable this very useful
> > +                * feature on Windows.
> > +                */
> > +               { "core.untrackedCache", "false" },
> > +#endif
>
> Interesting.  (I'm somewhat leery of the untrackedCache just knowing
> that it used to operate despite an exponential number of visits to
> files (exponential in depth of directories) and getting different
> answers with different visits, making me feel like it was black magic
> that it ever worked and wondering what kind of corner case issues
> still lurk with it.  See e.g.
> https://lore.kernel.org/git/CABPp-BFiwzzUgiTj_zu+vF5x20L0=1cf25cHwk7KZQj2YkVzXw@xxxxxxxxxxxxxx/)

The implementation of the untracked cache is certainly quite a challenge
to wrap one's head around. However, it does manage to speed up
operations substantially (when it works).

The real fun starts when you turn on the FSMonitor, though. Then the
untracked cache is reliable, all of a sudden! The reason seems to be some
sort of delayed lastModified (AKA mtime) evaluation which is somehow
triggered by FSMonitor ;-)

So in microsoft/git, where we include FSMonitor and turn it on as part of
`scalar clone`, we also enable the untracked cache, for noticeably happier
users.
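
To illustrate the mtime dependency mentioned above, here is a minimal
sketch of the kind of check involved. It is a toy model, not Git's code:
the function name is made up for this example, and the real untracked
cache is considerably more involved. The point is only that a cached
directory listing can be trusted as long as the directory's mtime has not
moved, so a platform that updates the mtime late will produce stale
answers.

```c
#include <sys/stat.h>
#include <time.h>

/*
 * Toy validity check (hypothetical helper, not part of Git): a cached
 * directory listing is reused only if the directory's mtime still
 * matches the value recorded when the listing was cached.
 *
 * Returns 1 if the cached listing may be reused, 0 if the directory
 * has to be scanned again (including when stat() fails).
 */
static int toy_dir_listing_still_valid(const char *dir, time_t cached_mtime)
{
	struct stat st;

	if (stat(dir, &st) < 0)
		return 0;
	return st.st_mtime == cached_mtime;
}
```

If a file is created in `dir` but the mtime is only updated a moment
later, a check like this would still return 1 in between and the new file
would be missed.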

> > +               { "core.logAllRefUpdates", "true" },
> > +               { "credential.https://dev.azure.com.useHttpPath", "true" },
>
> Not only opinionated, but special configuration for certain sites?
> I'm not complaining, just slightly surprised.

Yes. I am not aware of other sites where you would want to use different
credentials depending on the URL path, but Azure DevOps definitely is such
a site, and therefore needs `useHttpPath`. Rather than requiring users to
know this, we set it for them.

> > +               { "credential.validate", "false" }, /* GCM4W-only */
> > +               { "gc.auto", "0" },
> > +               { "gui.GCWarning", "false" },
> > +               { "index.threads", "true" },
> > +               { "index.version", "4" },
>
> I take it your users don't make use of jgit?

Nope ;-) I doubt that the features we use to make Git scalable are
implemented in JGit.

> (Users aren't using jgit directly here, at least not to my knowledge,
> but multiple gradle plugins do.)  I tried turning this on a while back,
> and quickly got multiple reports of problems because jgit didn't
> understand the index. I had to turn it off and send out various PSAs on
> how to recover.

TBH it gives me shivers of dread thinking about large
repositories/worktrees being handled within a Java VM. The amount of,
let's call it "non-canonical" code, required by JGit to make it somewhat
performant, is staggering. Just think about the way you have to emulate
mmap()ing part of a packfile and interpreting it as a packed C struct. I
forgot the details, of course, and I am quite glad that I did.

> > +               { "merge.stat", "false" },
> > +               { "merge.renames", "false" },
>
> Is this just historical and not needed anymore, is it here just for a
> little longer and you are planning on transitioning away from this, or
> are you still set on this setting?

It is here mostly for historical reasons.

> > +               { "pack.useBitmaps", "false" },
>
> I don't understand anything bitmap related, but I thought they were
> performance related, so I'm surprised by this one.  Is there a reason
> for this one?  (Is it handled by maintenance instead?)

Again, this is here for historical reasons. Scalar sets this, and my goal
with this patch series is to port it from .NET to C. So I did not question
the reasoning.

My _guess_, however, is that bitmaps really only work well when everything
is in one single pack, which is not the case with Scalar enlistments: they
are way too large to be repacked all the time.

> > +               { "pack.useSparse", "true" },
> > +               { "receive.autoGC", "false" },
> > +               { "reset.quiet", "true" },
> > +               { "feature.manyFiles", "false" },
>
> If you simply set core.untrackedCache to false _after_ setting
> feature.manyFiles to true, would it make sense to switch this?  (Or
> does it matter, since you've already individually set all the config
> settings that this one would set?)

Frankly, I was a bit puzzled why `feature.manyFiles` was set to `false`.
The rationale is explained in
https://github.com/microsoft/scalar/commit/2fc84dba9c95:

	The feature.* config settings change the defaults for some other
	config settings. We already monitor config settings pretty carefully,
	so let's disable these.

As to switching this, it shouldn't matter. The idea of `feature.*` is to
set defaults, but not override any explicitly configured settings.
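
To make that precedence concrete, here is a small toy model, using
`index.version` as the example (one of the settings `feature.manyFiles`
provides a default for). The struct, the function names and the baseline
value "2" are assumptions made for this sketch, and the exact-match
`strcmp()` ignores Git's case rules for config keys; it is not Git's
config machinery.

```c
#include <stddef.h>
#include <string.h>

/* Toy config entry; the table is terminated by a NULL key. */
struct toy_entry {
	const char *key;
	const char *value;
};

/* Return the explicitly configured value for `key`, or NULL if unset. */
static const char *toy_lookup(const struct toy_entry *cfg, const char *key)
{
	for (; cfg->key; cfg++)
		if (!strcmp(cfg->key, key))
			return cfg->value;
	return NULL;
}

/*
 * Resolve index.version in the toy model: an explicit setting always
 * wins; feature.manyFiles=true only changes the fallback value.
 */
static const char *toy_index_version(const struct toy_entry *cfg)
{
	const char *explicit_value = toy_lookup(cfg, "index.version");
	const char *many_files = toy_lookup(cfg, "feature.manyFiles");

	if (explicit_value)
		return explicit_value;
	if (many_files && !strcmp(many_files, "true"))
		return "4"; /* default implied by feature.manyFiles */
	return "2"; /* assumed baseline default for this sketch */
}
```

Since every setting that matters is written explicitly, the order in which
the `feature.*` keys and the individual keys are set makes no difference
to the result.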

> > +               { "feature.experimental", "false" },
> > +               { "fetch.unpackLimit", "1" },
> > +               { "fetch.writeCommitGraph", "false" },
> > +#ifdef WIN32
> > +               { "http.sslBackend", "schannel" },
> > +#endif
> > +               { "status.aheadBehind", "false" },
> > +               { "commitGraph.generationVersion", "1" },
> > +               { "core.autoCRLF", "false" },
> > +               { "core.safeCRLF", "false" },
> > +               { NULL, NULL },
> > +       };
>
> Are there easy-ish ways for other groups of users to adopt scalar but
> change the list of config settings (e.g. index.version and
> merge.renames) in some common way for all those users?

Not in Scalar.

I would hope, however, that we could figure out ways to make this more
configurable when re-implementing this functionality in core Git. I have a
couple of ideas, but nothing fleshed out. Besides, I do not want to think
too far ahead: I already made that mistake and then got bogged down in
discussions about minimal vs non-minimal changes in the top-level Makefile
;-)

So yeah, good point, but it's probably not a good time yet to discuss this
tangent.

Thank you for reviewing,
Dscho



