Re: [PATCH v3 00/34] Builtin FSMonitor Feature


 



On Mon, Jul 05 2021, Johannes Schindelin wrote:

> On Thu, 1 Jul 2021, Ævar Arnfjörð Bjarmason wrote:
>
>> On Thu, Jul 01 2021, Jeff Hostetler wrote:
>>
>> > On 7/1/21 1:40 PM, Ævar Arnfjörð Bjarmason wrote:
> [...]
>> > The early Linux version was dropped because inotify does not give
>> > recursive coverage -- only the requested directory.  Using inotify
>> > requires adding a watch to each subdirectory (recursively) in the
>> > worktree.  There's a system limit on the maximum number of watched
>> > directories (defaults to 8K IIRC) and that limit is system-wide.
>> >
>> > Since the whole point was to support very large repos, using
>> > inotify was a non-starter, so I removed the Linux version from our
>> > patch series.  For example, the first repo I tried it on (outside of
>> > the test suite) had 25K subdirectories.
>> >
>> > I'm told there is a new fanotify API in recent Linux kernels that is a
>> > better fit for what we need, but we haven't had time to investigate it
>> > yet.
>>
>> That default limit is a bit annoying, but I don't see how it's a blocker
>> in any way.
>
> Let me help you to see it.
>
> So let's assume that you start FSMonitor-enabled Git, with the default
> values. What is going to happen if you have any decently-sized worktree?
> You run out of handles. What then? Throw your hands in the air? Stop
> working? Report incorrect results?
>
> Those are real design challenges, and together with the race problems Jeff
> mentioned, they pose a much bigger obstacle than the rebasing you
> mentioned above.

You report an error and tell the user to raise the limit, and cover this
in your install docs. It's what watchman does:

https://github.com/facebook/watchman/blob/master/watchman/error_category.cpp#L28-L45
https://facebook.github.io/watchman/docs/install.html#system-specific-preparation
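To make the "report an error and tell the user to raise the limit" idea
concrete, here is a minimal sketch. It is not watchman's or Git's actual
code. It assumes one inotify watch per directory in the worktree and
reads the limit from the standard procfs file; the function names and
the suggested new value (twice the directory count) are made up for
illustration.

```python
import os

# Standard procfs location of the per-user inotify watch limit on Linux.
LIMIT_PATH = "/proc/sys/fs/inotify/max_user_watches"


def count_directories(worktree):
    """Count every directory under worktree, including worktree itself."""
    total = 0
    # os.walk yields once per directory; inotify is not recursive, so
    # each of these would need its own watch.
    for _root, _dirs, _files in os.walk(worktree):
        total += 1
    return total


def read_watch_limit(path=LIMIT_PATH):
    """Return the configured limit, or None if it cannot be read."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def check_watch_budget(worktree, limit=None):
    """Return None if the worktree fits in the limit, else an error string."""
    if limit is None:
        limit = read_watch_limit()
    if limit is None:
        return None  # not Linux, or limit unreadable: nothing to report
    needed = count_directories(worktree)
    if needed <= limit:
        return None
    # Hypothetical message; doubling the count is an arbitrary suggestion.
    return (
        f"error: {worktree} has {needed} directories but "
        f"fs.inotify.max_user_watches is {limit}; raise it, e.g.\n"
        f"  sysctl fs.inotify.max_user_watches={needed * 2}"
    )
```

A daemon could call check_watch_budget() at startup and refuse to start
(printing the message) instead of silently reporting incomplete results.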

>> You simply adjust the limit. E.g. I deployed and tested the hook version
>> of inotify (using watchman) in a sizable development environment, and
>> have written my own code using the API. This was all before fanotify(7)
>> existed. IIRC I set most of the limits to 2^24 or 2^20. I've used it
>> with really large Git repos, including with David Turner's
>> 2015-04-03-1M-git for testing (`git ls-files | wc -l` on that is around
>> a quarter-million).
>>
>> If you have a humongous repository and don't have root on your own
>> machine you're out of luck. But I think most people who'd use this are
>> either using their own laptop, or e.g. in a corporate setting where
>> administrators would tweak the sysctl limits given the performance
>> advantages (as I did).
>
> This conjecture that most people who'd use this are using their own laptop
> or have a corporate setting where administrators would tweak the sysctl
> limits according to engineers' wishes strikes me as totally made up from
> thin air, nothing else.
>
> In other words, I find it an incredibly unconvincing argument.

It's from a sample size of one experience of deploying this in a BigCorp
setting.

But sure, perhaps things are done differently where you work/have
worked. My experience is that even if you're dealing with some BOFHs and
e.g. are using shared racked development servers it's generally not an
insurmountable task to get some useful software installed, or some
system configuration tweaked.

In this case we're talking about ~40MB of kernel memory for 1 million
dirs IIRC, that coupled with the target audience that benefits most from
this probably being deployments that are *painfully* aware of their "git
status" slowness...

> I prefer not to address the rest of your mail, as I found it not only a
> lengthy tangent (basically trying to talk Jeff into adding Linux support
> in what could have been a much shorter mail), but actively distracting
> from the already long patch series Jeff presented. It is so long, in fact,
> that we had to put in an exemption in GitGitGadget because it is already
> longer than a usually-unreasonable 30 patches. Also, at this point,
> insisting on Linux support (in so many words) is unhelpful.

This part of the thread started because Jeff H. claimed upthread:

    [...]inotify was a non-starter, so I removed the Linux version from
    our patch series

But as I then noted, it works just fine; you just need to change some
sysctl limits.

It seems at this point we're debating whether some installations of
Linux have BOFH-y enough administrators that they won't tweak sysctl
limits for you. OK, but given that I've run this thing in a production
setting it's clearly not a "non-starter". I think it could be useful for
a lot of users.

I'll reply with more (and hopefully helpful) specifics to Jeff's mail.

> Let me summarize why I think this is unhelpful: In Git, it is our
> tradition to develop incrementally, for better or worse. Jeff's effort
> brought us to a point where we already have Windows and macOS support,
> i.e. support for the most prevalent development platforms (see e.g.
> https://insights.stackoverflow.com/survey/2020#technology-developers-primary-operating-systems).
> We already established multiple obstacles for Linux support, therefore
> demanding Linux support to be included Right Now would increase the patch
> series even further, making it even less reviewable, being even less
> incremental, hold up the current known-to-work-well state, force Jeff to
> work on something he probably cannot work on right now, and therefore
> delaying the entire effort even further.

I think we just disagree. I wouldn't call my opinion "unhelpful" any
more than yours.

I don't think Git's ever had anything like a major feature (built in,
config settings, etc. etc.) contributed by a proprietary OS vendor that
works on that vendor's OS, as well as another vendor's proprietary OS,
but not on comparable free systems.

Is that less incremental and perhaps less practical? Sure. It's not an
entirely practical viewpoint. I work on free software partly for
idealistic reasons. I'd prefer if the project I'm working on doesn't
give users a carrot to pick proprietary systems over free ones.

But ultimately it's not my call, but Junio's.






