Re: Preparing for ocaml 4.11

Jerry James <loganjerry@xxxxxxxxx> · Thu, 25 Jun 2020 16:21:18 -0600

On Thu, Jun 25, 2020 at 4:16 AM Dan Čermák
<dan.cermak@xxxxxxxxxxxxxxxxxxx> wrote:
> %generate_buildrequires could be used to parse the opam file from the
> tarball and extract the dependent opam packages from the depends:
> array. E.g. the following:

I've been thinking along those lines as well.  More automation would be great.

> The hypothetical opam2spec would be certainly also a huge improvement,
> as I think that most ocaml packages could be completely autogenerated.

My experience is that the build dependencies in the opam files are of
high quality, but that test and documentation dependencies are often
omitted.  We could add those manually, so that may not be a big deal.
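
As a rough illustration of pulling the dependency list out of an opam
file and separating test and documentation dependencies from build
dependencies, here is a minimal Python sketch.  The opam snippet and
the function name classify_depends are made up for the example, and
the regular expressions only cope with the simple layout shown; a real
tool would want a proper opam parser.

```python
import re

# Hypothetical opam file contents, invented for this example.
SAMPLE_OPAM = '''
opam-version: "2.0"
depends: [
  "ocaml" {>= "4.08.0"}
  "dune" {>= "2.0"}
  "alcotest" {with-test}
  "odoc" {with-doc}
]
'''

def classify_depends(opam_text):
    """Split the depends: field into build, test and doc package names."""
    result = {"build": [], "test": [], "doc": []}
    # Assumption: the depends list contains no nested ']' characters.
    field = re.search(r'^depends:\s*\[(.*?)\]', opam_text, re.S | re.M)
    if field is None:
        return result
    # Each entry is a quoted name, optionally followed by a {filter}.
    entries = re.findall(r'"([^"]+)"\s*(?:\{([^}]*)\})?', field.group(1))
    for name, flt in entries:
        if "with-test" in flt:
            result["test"].append(name)
        elif "with-doc" in flt:
            result["doc"].append(name)
        else:
            result["build"].append(name)
    return result

deps = classify_depends(SAMPLE_OPAM)
print(deps)
```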

I have found myself copying lots of boilerplate from spec file to spec
file while adding packages.  I think I've only encountered a few
variations, which seem to be centered around the chosen build tool:
- dune
- topkg
- ocamlbuild
- ocamlc/ocamlopt

with increasing entropy as you go down the list. :-)

> But as you already said, the devil is most likely in the details, so
> maybe we should start collecting ideas and potential pitfalls first? I
> have therefore also cc'd Olaf, who takes care of Ocaml in
> openSUSE. I would hope that we can coordinate our efforts, as Olaf has
> some pretty nice ocaml macros that automatically generate file lists for
> the subpackages (thereby already simplifying packaging quite a bit).

By all means, let's draw from the experiences of those who have
traveled this road already.  Olaf, thanks for chiming in.  I'm eager
to learn from your experiences.

I'm glad the idea of opam2spec was brought up.  I've been thinking
about that, and a couple of other tools:

Tool 1: visualize package dependencies

The idea is to pull dependency information from upstream sources [1],
then draw a graph representing the dependencies.  We distinguish
between different kinds of dependencies:
- build
- documentation
- test
and further distinguish each dependency as "required" or "optional".
The graph uses colors, line styles, or both to distinguish between the
various dependency types.

I want an option to highlight dependency loops (perhaps cycling
through them one at a time if there is more than one).  There sure are
a lot of those in the OCaml world (when testing and documentation
dependencies are included).
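
One way to find the loops to highlight is to compute the strongly
connected components of the dependency graph; every component with
more than one node (or with a self-edge) contains at least one loop.
A minimal Python sketch using Tarjan's algorithm follows.  The package
names and the function names are hypothetical, and the recursive form
assumes graphs of at most a few hundred nodes.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm.  graph maps a node to its successors."""
    index, low = {}, {}
    stack, on_stack = [], set()
    result = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            result.append(comp)

    for v in list(graph):
        if v not in index:
            visit(v)
    return result

def dependency_loops(graph):
    """Keep only the components that actually contain a cycle."""
    return [sorted(c) for c in strongly_connected_components(graph)
            if len(c) > 1 or c[0] in graph.get(c[0], ())]

# Made-up example: a -> b -> c -> a is a loop, d merely depends on it.
example = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
loops = dependency_loops(example)
print(loops)
```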

Finally, I want the tool to help me visualize the effects of various
strategies for breaking loops.  That is, I want the tool to
automatically find some way of breaking loops so that a directed
acyclic graph is produced.  For example, I might say that when loops
are broken, the edges to remove are chosen to minimize the number of
edge removals, and use this preference order on the edge types (from
most preferred to least preferred type to remove):
- optional test
- required test
- optional documentation
- required documentation
- optional build
- required build

The tool should make it easy for me to select different orderings of
the preferred types, and give me a count of the number of removed
edges in each case.  Finding a minimum set of edges to remove is the
minimum feedback arc set problem, which is NP-hard in general, but I
expect that the typical use of such a tool would be on a few hundred
nodes at most, which shouldn't be a big deal on modern computers.
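
To make the preference order concrete, here is a Python sketch of a
simple greedy strategy: find any cycle, remove the edge on it whose
type is most preferred for removal, and repeat.  This is a heuristic
swapped in for illustration; it does not guarantee a minimal number of
removals.  The data layout (a dict from (source, target) to edge type)
and all names are assumptions.

```python
PREFERENCE = [
    "optional test", "required test",
    "optional documentation", "required documentation",
    "optional build", "required build",
]

def find_cycle(edges):
    """Return the edges of one cycle, or None if the graph is acyclic."""
    succ = {}
    for s, d in edges:
        succ.setdefault(s, []).append(d)
    state = {}   # 1 = on the current DFS path, 2 = finished
    path = []

    def dfs(v):
        state[v] = 1
        path.append(v)
        for w in succ.get(v, ()):
            if state.get(w) == 1:
                nodes = path[path.index(w):] + [w]
                return list(zip(nodes, nodes[1:]))
            if w not in state:
                found = dfs(w)
                if found:
                    return found
        path.pop()
        state[v] = 2
        return None

    for v in list(succ):
        if v not in state:
            found = dfs(v)
            if found:
                return found
    return None

def break_cycles(edges, preference=PREFERENCE):
    """Greedily remove edges until no cycle is left."""
    rank = {kind: i for i, kind in enumerate(preference)}
    remaining = dict(edges)
    removed = []
    while True:
        cycle = find_cycle(remaining)
        if cycle is None:
            return remaining, removed
        victim = min(cycle, key=lambda e: rank[remaining[e]])
        removed.append((victim, remaining.pop(victim)))

# Made-up graph with two loops.
example = {
    ("a", "b"): "required build",
    ("b", "a"): "optional test",
    ("b", "c"): "required build",
    ("c", "b"): "required documentation",
}
remaining, removed = break_cycles(example)
print(len(removed), "edges removed:", removed)
```

Passing a different list as the preference argument and comparing
len(removed) gives the per-ordering count described above.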

This tool will help me decide which dependencies should be hidden
behind a %bcond_with of one kind or another.  In particular, I'm
interested in using %bcond_with bootstrap because of the next tool.

Tool 2: compute a build order

Such a tool has been mentioned on this mailing list a number of times.
Richard Jones has a script that does this for OCaml, but I think it
could be extended to arbitrary RPM packages with some effort.  It
would work something like this:

Stage 1: read spec files (possibly specified as a list of checked-out
git repositories, like Richard's script) to generate two lists per
input:
- BuildRequires when --with-bootstrap is given
- BuildRequires when --without-bootstrap is given
If the two lists are identical, then discard the --with-bootstrap list.

Stage 2: Depsolve to convert each list into a list of package names;
i.e. convert logical, file, and boolean dependencies into explicit
package names.  If there are two lists, check again if they are
identical (they shouldn't be, but let's be careful), and discard the
--with-bootstrap list if so.

Stage 3: Construct a graph with one node for each list, labeled with
the name of the package/spec file from which the list was generated.
If it is a --with-bootstrap list, then give the node type BOOTSTRAP,
otherwise give it type NORMAL.

Stage 4: For each list L and each package P in L, if there are no
nodes in the graph corresponding to P, then use repoquery to get a
list of --without-bootstrap BuildRequires for P.  Make a node for P
and give it type DONOTBUILD.

Stage 5: For each node N in the graph, if N's list of BuildRequires
contains another node M in the graph, then add an edge from M to N.
If there are both BOOTSTRAP and NORMAL nodes for M, then make the edge
start at the BOOTSTRAP node.
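
A small Python sketch of the edge rule in Stage 5, assuming the
depsolved BuildRequires from the earlier stages are already available
as a dict.  The Node type, the add_edges name and the sample packages
are hypothetical.

```python
from collections import namedtuple

# kind is one of "BOOTSTRAP", "NORMAL" or "DONOTBUILD".
Node = namedtuple("Node", "name kind")

def add_edges(requires):
    """requires maps each Node to the set of package names it needs.

    Returns a set of (M, N) pairs meaning "M must be built before N".
    """
    by_name = {}
    for node in requires:
        by_name.setdefault(node.name, {})[node.kind] = node
    edges = set()
    for n, needed in requires.items():
        for pkg in needed:
            kinds = by_name.get(pkg)
            if not kinds:
                continue  # no node for this package
            # When both variants exist, the edge starts at BOOTSTRAP.
            m = (kinds.get("BOOTSTRAP") or kinds.get("NORMAL")
                 or kinds.get("DONOTBUILD"))
            edges.add((m, n))
    return edges

# Made-up input: A and B require each other, A has a bootstrap mode.
requires = {
    Node("A", "BOOTSTRAP"): set(),
    Node("A", "NORMAL"): {"B"},
    Node("B", "NORMAL"): {"A", "glibc"},
    Node("glibc", "DONOTBUILD"): set(),
}
edges = add_edges(requires)
for m, n in sorted(edges):
    print(m, "->", n)
```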

Stage 6: Run a cycle detector on the graph.  If any cycles are found,
report them and exit.  Somebody hasn't broken a cycle with %{with
bootstrap} and has to fix that before we can continue.  See tool 1.
If we have not exited, then the graph is a DAG (directed acyclic
graph).
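
For the cycle check in Stage 6, one option in Python 3.9 or later is
the standard graphlib module, which raises CycleError and reports one
offending cycle.  The sketch below assumes edges are (before, after)
pairs; the function name is made up.

```python
from graphlib import TopologicalSorter, CycleError

def check_acyclic(edges):
    """Return one cycle as a list of nodes, or None if there is none."""
    ts = TopologicalSorter()
    for before, after in edges:
        ts.add(after, before)  # 'after' depends on 'before'
    try:
        ts.prepare()
    except CycleError as err:
        # args[1] is the cycle; its first and last nodes are the same.
        return err.args[1]
    return None

ok = check_acyclic([("a", "b"), ("b", "c")])
bad = check_acyclic([("a", "b"), ("b", "a")])
print(ok, bad)
```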

Stage 7: For each BOOTSTRAP node N in the graph, if the NORMAL node M
with the same name is not reachable from N, then:
- Merge all BuildRequires listed on N into M.
- If M now has BuildRequires on both the BOOTSTRAP and NORMAL versions
of some package, then discard the BOOTSTRAP BuildRequires.
- Discard N.
The graph is still a DAG because otherwise M would have been reachable from N.

Stage 8: Traverse the graph in DAG order; i.e., starting from nodes
with no incoming edges.  For each DONOTBUILD node N visited:
- Merge all of N's incoming edges (if any) into the nodes on the other
side of its outgoing edges.
- Remove N and all outgoing edges from the graph.
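
Stage 8 amounts to bypassing each DONOTBUILD node: every predecessor
of N gets an edge to every successor of N, and then N is dropped.  A
Python sketch, assuming the graph is already a DAG and edges are
(before, after) pairs; the names are hypothetical.

```python
from graphlib import TopologicalSorter

def drop_donotbuild(edges, donotbuild):
    """Remove the given nodes, reconnecting their neighbours."""
    preds, succs, nodes = {}, {}, set()
    for a, b in edges:
        succs.setdefault(a, set()).add(b)
        preds.setdefault(b, set()).add(a)
        nodes.update((a, b))
    # The sorter copies the graph, so later mutation is safe.
    order = TopologicalSorter({n: preds.get(n, set()) for n in nodes})
    for n in order.static_order():
        if n not in donotbuild:
            continue
        for s in succs.get(n, set()):
            preds[s].discard(n)
            for p in preds.get(n, set()):
                preds[s].add(p)
                succs[p].add(s)
        for p in preds.get(n, set()):
            succs[p].discard(n)
        succs.pop(n, None)
        preds.pop(n, None)
    return {(a, b) for a, bs in succs.items() for b in bs}

# Made-up chain: x -> d1 -> d2 -> y -> z, with d1 and d2 not built.
chain = [("x", "d1"), ("d1", "d2"), ("d2", "y"), ("y", "z")]
reduced = drop_donotbuild(chain, {"d1", "d2"})
print(sorted(reduced))
```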

At this point, we have a DAG giving a partial order on the packages to
be built, each one labeled either BOOTSTRAP or NORMAL.  Build them any
way you like that is consistent with the partial order and the node
types.
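
One order consistent with the partial order is to build in "waves",
where every package in a wave depends only on earlier waves and so
could be built in parallel.  A Python 3.9+ sketch using graphlib; the
(name, type) tuples and the sample edges are invented for the example.

```python
from graphlib import TopologicalSorter

def build_waves(edges):
    """Group nodes into successive waves of independent builds."""
    ts = TopologicalSorter()
    for before, after in edges:
        ts.add(after, before)
    ts.prepare()  # raises CycleError if the graph is not a DAG
    waves = []
    while ts.is_active():
        ready = sorted(ts.get_ready())
        waves.append(ready)
        ts.done(*ready)
    return waves

sample = [
    (("A", "BOOTSTRAP"), ("B", "NORMAL")),
    (("B", "NORMAL"), ("A", "NORMAL")),
    (("B", "NORMAL"), ("C", "NORMAL")),
]
waves = build_waves(sample)
for i, wave in enumerate(waves, 1):
    print(i, wave)
```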

I may have overthought this a bit. :-)  If either tool exists
somewhere, even in an unfinished state, tell me!  Otherwise, since I'm
approximately the world's worst at choosing project names [2],
somebody do that and I'll chip in with some code.

Footnotes:
[1] We would pull from opam.ocaml.org in this case, but if the tool is
designed well, it could have "plugins" or "modules" that know how to
read from a variety of sources.  I'm interested in having this work
with the GAP packages I maintain, for example, which also have lots of
dependency loops involving test and documentation.
[2] For tool 1 ... dagomatic?  See, I told you!
-- 
Jerry James
http://www.jamezone.org/