Re: [PATCH v3 4/4] Documentation: add lint-fsck-msgids

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 27 2022, Jeff King wrote:

> On Wed, Oct 26, 2022 at 01:35:43PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > (I had a similar thought after getting hit on the head by the recent
>> > t0450-txt-doc-vs-help.sh).
>> 
>> Sorry about that. FWIW I've wanted to assert that for a while, and to do
>> it by e.g. having the doc *.txt blurbs generated from running "$buildin
>> -h" during the build.
>> 
>> But when I suggested that I gathered That Junio wasn't up for that
>> approach, it does have its downsides thorugh. E.g. to build the docs
>> you'd now have to compile C code, and e.g. that git-scm.com publisher
>> written in Ruby would have to compile the code to generate its docs.
>
> Yes, it would definitely break the git-scm.com importer. It might be
> possible to make it work by actually running "make man" (or maybe even
> some partial targets) in a local repo. The nightly update job pulls all
> of the data using GitHub's API, but it does run in a heroku dyno that
> has git and gcc. Doing a shallow clone of the tag to build is probably
> more expensive, but the cost of an actual build isn't that important.
> 99% of the scheduled job runs are noops because there's no new version
> of Git to build manpages for; as long as those remain cheap-ish, we're
> OK.
>
> I also think in the long run that the site should move to a system that
> builds off a local repo, which we can trigger manually or via GitHub
> Actions. But that doesn't exist yet, and I don't think anyone's actively
> working on it.
>
> I also think it would be nice if the git-scm.com system relied more on
> "make man USE_ASCIIDOCTOR=1", possibly by hooking into
> asciidoctor-extensions.rb or similar, rather than munging files in an
> ad-hoc manner. But that's also a big change that nobody is actively
> working on.
>
> All of which is to say that yes, having docs depend on C code will cause
> work on the site, but it may be work that takes us in the right
> direction.

Good to know.

>> Or we could do it the other way around, and scape the *.txt for the *.c
>> generation, but then we need to run a new script for building
>> builtin/*.o. Also possible, and I think eventually both are better than
>> what t0450-txt-doc-vs-help.sh's doing now, but that was a less invasive
>> change than both...
>
> I think this particular case is tricky in that direction, because it's a
> big set of dependencies that aren't necessarily one-to-one. E.g.,
> builtin/log.c needs to depend on git-log.txt, but also on git-show.txt
> and git-format-patch.txt.

I was thinking of just a generated usage-strings.c or whatever. I.e. one
file with every usage string in it. Then you don't need to hunt down
which thing goes in which file. We'll just have a variable named
"builtin_format_patch_usage". Include it in "builtin.h" and it doesn't
matter if that's in builtin/log.c or builtin/format-patch.c.

It does mean you need to rebuild the C code for other built-ins every
time one of the SYNOPSIS sections changes, but doesn't happen too often.

>> >> +# Unfortunately our dependency management of generated headers used
>> >> +# from within other headers suck, so we'll need to list every user of
>> >> +# fsck.h here, but not too bad...
>> >> +FSCK_MSGIDS_H_BUILTINS = fsck index-pack mktag receive-pack unpack-objects
>> >> +$(foreach f,$(FSCK_MSGIDS_H_BUILTINS:%=builtin/%),$f.sp $f.s $f.o): fsck-msgids.h
>> >> +FSCK_MSGIDS_H_LIBS = fetch-pack fsck
>> >> +$(foreach f,$(FSCK_MSGIDS_H_LIBS),$f.sp $f.s $f.o): fsck-msgids.h
>> >
>> > I don't understand the "used within other headers" part here. Computed
>> > dependencies will get this right. It's only the initial build (before we
>> > have any computed dependencies) that needs to know that the C files in
>> > question depend on the generated file. But that is true whether they do
>> > so directly or indirectly.
>> 
>> I forgot the full story, but luckily I wrote it down in the WIP commits
>> :) FWIW if you want to scour that it's mainly:
>> 
>> 	https://github.com/avar/git/commit/a00f1cb9ea5 # add advice-type.h
>> 	https://github.com/avar/git/commit/9e080923a11 # generate 'struct advice
>
> That seems like the same problem to me. It's just that if something is
> in cache.h, then it's needed by basically everything, and so the
> dependency list is big.

I think I either did or was planning to take it out of cache.h as part
of that, we put way too much in cache.h.

Even advice.c depends on cache.h for its advice.h *sigh*.

Trying it just now putting advice.h in builtin.h instead leaves 10
top-level files not-compiling (and they just need the header).

I think it's fine to include common utilties in our "included by
everything" headers, but if we've got ~500 *.c files and something is
only needed by ~20 of them (as in this case) we should probably just
include it in those 20.

>> Anyway, in partial summary, why (and sorry this got so long):
>> 
>>  * Yes, once we build e.g. fsck.o we have the full dependency tree,
>>    yay....
>>  * ...but only for gcc & clang, but we support other compilers.
>>  * ...for other compilers (or gcc & clang without the dep detection
>>    enabled) what should this do:

I didn't summarize this well enough, I should have said that whether you
have computed deps or not that.....

>> 
>> 	make file-that-does-not-use-generated-header.o
>> 
>>    It sucks a bit to have e.g. every *.o depend on command-list.h, when
>>    we only use it in 1-2 places, ditto the other headers.
> But that is already true of non-generated headers. If your system
> doesn't support computed deps, then all objects depend on all headers.

... this does not build e.g. command-list.h:

	make clean
	make grep.o

But this does:

	make clean
	make help.o

Because we've manually declared that.

[Re-arranging a bit]

> That all comes from b8ba629264 (Makefile: fold MISC_H into LIB_H,
> 2012-06-20).

Yes, but you get that for free whether you have computed deps or
not. I.e. your compiler is about to build the code anyway.

But it wasn't about to run a bunch of shellscripts to generate headers
it doesn't actually need.

> Yes, that sucks for you. But almost nobody actively develops on such a
> system, and people building Git to use rarely notice (because they are
> doing an initial build, or jumping versions, both of which generally
> require recompilation anyway).

I guess I'm "almost nobody" :) Not because I don't use computed deps,
but because I tend to e.g. do large interactive rebases with:

	git rebase -i 'make grep.o'

Or some other subset of our built assets/objects.

On that front no one thing is the end of the world, i.e. sure, if we run
some shellscript to make a needed-header.h that takes 10ms we can live
with it.

But it adds up, which is one reason I've been slowly trickling in
patches to optimize various common parts of the Makefile & those
scripts.

I think before I started that the fixed cost even to have just make
decide it had nothing to do was often 500ms-1s.

I think that's down to 1/4 or so of that on "master", and I've got some
local patches to get it consistently down to <50ms.

It adds up, especially when combined with ccache & the like. I.e. if i'm
testing 10 commits I happen to have cached a "rebase -i" goes from
noticeably slow (~10s) to not being so slow as to be distracting.

Anyway, all of which is to say that that's one thing I was aiming for:
If I was going to submit patches for new generated headers to find some
good trade-off between complexity and over-declaring dependencies.

>>  * We also have e.g. "make hdr-check", which works just fine now, but
>>    it's becausue we have no *.h file that uses another generated *.h
>>    file.
>
> I'm not that surprised. Probably it should depend on $(GENERATED_H), and
> possibly even be checking those files too.
>
>>  * I may be wrong (I'm just looking at some old "here be dragons" note
>>    of mine), but I think there's also an edge case where the dependency
>>    graph of .depend.d/* is correct *once you've always had it*, but if a
>>    file now e.g. depends on foo.h, and I make foo.h include a new
>>    foo-generated.h things go boom.
>
> That should work because the timestamp on foo.h will have been updated,
> causing anything that includes it to rebuild (and thus updating its
> computed dependencies to include foo-generated.h).

Yes, in general it should work, maybe there's some Heisenbug there. I
can't recall or reproduce it, sorry.

But that's a good thing, maybe there's no issue :)




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux