Fedora 33 Self-Contained Change proposal: Automatic RPM dependencies on Python Extras

https://fedoraproject.org/wiki/Changes/PythonExtras

== Summary ==
The Python RPM dependency generator (that generates
<code>python3.Xdist(foo)</code> requirements) will be adapted to also
generate requirements on
[https://www.python.org/dev/peps/pep-0508/#extras Python extras] (e.g.
<code>python3.Xdist(foo[bar])</code>) whenever upstream metadata
indicates such a dependency. An easy opt-out mechanism will exist. A
supported way of adding metapackages that provide such Python extras
(e.g. <code>python3.Xdist(foo[bar])</code>) will be introduced. Change
owners will add the missing metapackages that would otherwise cause
broken dependencies (in non-modular packages).

== Owner ==
* Name: [[User:Torsava|Tomáš Orsava]]
* Name: [[User:Churchyard|Miro Hrončok]]
* Email: <python-maint@xxxxxxxxxx>



== Detailed Description ==

=== The problem ===

[https://www.python.org/dev/peps/pep-0508/#extras Python extras] are a
way for a Python package (called "distribution" or "distribution
package" upstream) to declare that extra dependencies are required for
additional functionality.

For example, the Python package <code>requests</code> has several standard
dependencies (e.g. <code>urllib3</code>). It also declares an
extra named <code>requests[security]</code>, which lists additional
dependencies (e.g. <code>pyOpenSSL</code>) needed for that
additional functionality. The Python package code handles the missing
optional dependency gracefully: it won't crash, but it might emit a
warning or an actionable error message instructing the user to
install <code>requests[security]</code>.

Python packages included in Fedora as RPMs automatically create a
special Provides in the format <code>python3.Xdist(foo)</code> (and
<code>python3dist(foo)</code>) where <code>foo</code> is the upstream
Python package distribution name and X is the Python minor version.
That way you can require any Python package without knowing under
which name it was packaged in Fedora. And these tags are also
automatically used by the Python dependency generator, which reads
upstream Python metadata and creates dependencies on these Provides.

However, Python extras are not yet handled by the Provides tags, which
leads to imperfections and problems in declared dependencies.

=== Status quo ===

Currently in Fedora (before this change), no package provides
<code>python3.Xdist(foo[bar])</code> for the <code>foo[bar]</code>
Python extra. As a direct result of this, no package can require it.
The automatic RPM Python dist dependency generator only generates an
incomplete requirement on the base package
(<code>python3.Xdist(foo)</code>) in such cases.

Transitive dependencies of extras often had to be hardcoded
manually. I.e. when <code>foo</code> requires <code>bar[baz]</code>,
package <code>bar</code> does not require the additional dependencies
for the <code>bar[baz]</code> extra. Thus <code>foo</code> needs to
hardcode those dependencies manually. For example:
[https://src.fedoraproject.org/rpms/poetry/c/97fa3d908]. This leads to
dependencies that may be missing, broken, outdated or superfluous.

=== Extras metapackages ===

In this change proposal, we propose to solve the problem using
metapackages. The following metapackage represents the
<code>setuptools_scm[toml]</code> extra for the
<code>python3-setuptools_scm</code> RPM package
(<code>python-setuptools_scm</code> source package):

 %package -n python3-setuptools_scm+toml
 Summary: Metapackage for python3-setuptools_scm: toml extra
 Requires: python3-setuptools_scm = %{?epoch:%{epoch}:}%{version}-%{release}

 %description -n python3-setuptools_scm+toml
 This is a metapackage bringing in toml extra requires for python3-setuptools_scm.
 It contains no code, just makes sure the dependencies are installed.

 %files -n python3-setuptools_scm+toml
 %ghost %{python3_sitelib}/*.egg-info

Notice several things:

* The package has a hard dependency on <code>python3-setuptools_scm =
%{?epoch:%{epoch}:}%{version}-%{release}</code>. While this could be
in theory generated by the dependency generator, the change owners
have decided not to do that to allow certain leeway for
experimentation. However, the dependency will be created by the macro
helper below. Technically, <code>%{?_isa}</code> should also be used
for arched packages, but in practice we believe it can be omitted.

* The package contains no files except the <code>%ghost</code>
metadata. This is needed for the dependency generator to have access
to the upstream metadata of this package.

The [https://src.fedoraproject.org/rpms/python-rpm-generators/pull-request/19
updated RPM Python dist dependency generator] parses the extras name
from the subpackage name by splitting it on the <code>+</code> sign.
This naming scheme is not new; it is copied from Rust packaging. Five
Python packages in Fedora already use the same scheme for similar
metapackages representing Python extras. And normalized Python
distribution package names (or extras names) don't naturally contain
the <code>+</code> sign. (Neither do existing Fedora packages prefixed
with <code>python3-</code>, except the 5 components already
mentioned.)
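
The name splitting described above can be illustrated with a minimal Python sketch. This is not the code from the linked pull request: the function name <code>split_extras_subpackage</code> is hypothetical, and splitting on the rightmost <code>+</code> (and treating a trailing <code>+</code> as "no extra") is an assumption made for this illustration.

```python
def split_extras_subpackage(name):
    """Split an RPM (sub)package name into (base_name, extra).

    Returns (name, None) when the name does not encode an extra.
    Assumption for this sketch: the extra is whatever follows the
    rightmost '+' sign; an empty part means there is no extra.
    """
    if "+" not in name:
        return name, None
    base, _, extra = name.rpartition("+")
    if not base or not extra:
        # Names such as "foo+" do not carry an extras name.
        return name, None
    return base, extra


# "python3-setuptools_scm+toml" splits into the base package and the extra.
print(split_extras_subpackage("python3-setuptools_scm+toml"))
# A plain package name is returned unchanged, with no extra.
print(split_extras_subpackage("python3-setuptools_scm"))
```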

The metapackage can have additional features if desired. For example:

* It can obsolete/provide other names (e.g. obsoleted extras packages)
* It can have manual strong or weak dependencies on other (possibly
non-Python) packages
* It can contain files excluded from the "base" package (if such files
only make sense with the extra and the base package does not fail
without them)

The "base" package (in this case <code>python3-setuptools_scm</code>)
can optionally Require/Recommend/Suggest a Python extras metapackage
if the packager deems it useful.

The change for the RPM Python dist dependency generator is prepared in:

* https://github.com/torsava/rpm/pull/2 (PR for upstream RPM will
follow after this change is discussed in Fedora)
* https://src.fedoraproject.org/rpms/python-rpm-generators/pull-request/19
(to be adapted based on feedback and merged in Fedora once the change
is approved)

=== Macro helper ===

For the most common case, the change owners have prepared a macro
helper in https://src.fedoraproject.org/rpms/python-rpm-macros/pull-request/59

To generate the example above, it should be used like this:

 %{?python_extras_subpkg:%python_extras_subpkg -n python3-setuptools_scm -i %{python3_sitelib}/*.egg-info toml}

* The <code>%{?python_extras_subpkg:...}</code> way of using this
macro ensures the spec file remains valid for older Fedora/EL
releases, where this code will do nothing.
* The <code>-n</code> option specifies the name of the "base" package.
* The <code>-i</code> option specifies the <code>%files %ghost</code>
path (glob) to the metadata directory (the <code>.dist-info</code>
or <code>.egg-info</code> directory).
* One or more positional arguments specify the names of the extras;
one metapackage is generated for each name provided.

Other possible arguments:
* The <code>-f</code> option (conflicts with <code>-i</code> and
<code>-F</code>) can specify the relative path to the filelist for
this metapackage (which should contain the <code>%files %ghost</code>
path (glob) to the metadata directory). This API is prepared for
integration with <code>pyproject-rpm-macros</code>.
* The <code>-F</code> flag (conflicts with <code>-i</code> and
<code>-f</code>) can be used to skip the <code>%files</code> section
entirely (if the packager wants to construct it manually).

Note that this macro generates all the subpackage definition sections
(<code>%package</code> including the Summary and Requires on the base
package, <code>%description</code> and <code>%files</code>), and hence
it cannot be extended with custom Provides/Obsoletes/Requires/etc.
This macro is designed to fit the most common uses. It doesn't
currently cover all use cases. Packagers can, however, construct the
subpackage manually if they need custom features not covered by
<code>%python_extras_subpkg</code>. In the future, the API of the
macro can be extended if there is demand.

See the [https://src.fedoraproject.org/rpms/python-rpm-macros/pull-request/59
linked pull request] for example outputs.

Due to technical limitations, the macro helper never generates a
requirement on the arched <code>BASE_PACKAGE%{?_isa} =
%{?epoch:%{epoch}:}%{version}-%{release}</code>. It only adds
<code>Requires: BASE_PACKAGE =
%{?epoch:%{epoch}:}%{version}-%{release}</code> because a
[https://github.com/rpm-software-management/rpm/issues/689 macro
cannot reliably detect if the subpackage is arched or not]. The change
owners believe the resolver will do the right thing by default. If
there are problems with this approach, an additional flag (such as
<code>-a</code>) can be introduced to indicate an arched base package.

=== Why is there no automatic extras discovery? ===

[http://lists.rpm.org/pipermail/rpm-ecosystem/2020-February/000730.html
RPM is not capable of creating dynamic subpackages] based on the
content in <code>%{buildroot}</code> or on the unpacked sources
(<code>%{_builddir}</code>) yet.

Hence, we require the packager to manually list which Python extras
(if any) should be packaged as metapackages. Not all extras are useful
for us anyway, as there are often extras representing the
build/dev/doc/test dependencies of the project.

In the future (once/if RPM supports this), the generators can be
extended with auto-discovery of Python extras (with filtering).

=== Automatic provides generator ===

To continue with our example, the
<code>python3-setuptools_scm+toml</code> subpackage will Provide
<code>python3.Xdist(setuptools_scm[toml])</code> (and also
<code>python3dist(setuptools_scm[toml])</code>).

An attempt to package a nonexistent extra (e.g.
<code>python3-setuptools_scm+nopenopenope</code>) will result in a build
failure with a human-readable error message.
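
As a rough illustration of the behavior described in this section, the following Python sketch builds the two Provides strings for an extras metapackage and refuses an extra that upstream does not declare. The function name <code>extras_provides</code>, its parameters and the error wording are hypothetical; the real generator reads the declared extras from the packaged metadata and may also attach versions, which this sketch leaves out.

```python
def extras_provides(dist_name, extra, declared_extras, python_version="3.9"):
    """Return the Provides strings for one Python extras metapackage.

    declared_extras is the set of extras the upstream metadata declares
    (passed in directly here; this is an assumption of the sketch).
    """
    if extra not in declared_extras:
        # Mirrors the idea of failing the build with a readable message.
        raise ValueError(
            f"{dist_name} does not declare the extra {extra!r}; "
            f"declared extras: {sorted(declared_extras)}"
        )
    major = python_version.split(".")[0]
    return [
        f"python{python_version}dist({dist_name}[{extra}])",
        f"python{major}dist({dist_name}[{extra}])",
    ]


for provide in extras_provides("setuptools_scm", "toml", {"toml"}):
    print(provide)
```

Calling the function with an undeclared extra such as <code>nopenopenope</code> raises <code>ValueError</code> instead of silently producing a Provides.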

=== Automatic requires generator ===

If a Python package requires <code>setuptools_scm[toml]</code>, the
Fedora RPM package will require
<code>python3.Xdist(setuptools_scm[toml])</code> and also
<code>python3.Xdist(setuptools_scm)</code>. In theory, the second
requirement is redundant, but in practice, it makes it easier (and
less error prone) to query package dependencies in Fedora (e.g. using
<code>dnf repoquery</code>).
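
The mapping from an upstream requirement to the two RPM requirements can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the function name <code>rpm_requires</code> is hypothetical, version specifiers and environment markers are ignored, and names are not normalized.

```python
import re

# Matches a distribution name optionally followed by "[extra1,extra2]".
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[([^\]]*)\])?")


def rpm_requires(requirement, python_version="3.9"):
    """Translate one upstream requirement into RPM Requires strings.

    The base requirement is always emitted; one additional requirement
    is emitted for every extra named in the square brackets.
    """
    match = _REQUIREMENT.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    name, extras = match.group(1), match.group(2)
    requires = [f"python{python_version}dist({name})"]
    for extra in (extras or "").split(","):
        extra = extra.strip()
        if extra:
            requires.append(f"python{python_version}dist({name}[{extra}])")
    return requires


for require in rpm_requires("setuptools_scm[toml]"):
    print(require)
```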

The packaged extras will also Require additional dependencies listed
in their Python metadata. In the case of
<code>python3-setuptools_scm+toml</code>, it will require
<code>python3.Xdist(toml)</code> (because on the Python level,
<code>setuptools_scm[toml]</code> requires <code>toml</code>).
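
To show how the dependencies of a single extra could be picked out of upstream metadata, here is a minimal Python sketch. It assumes <code>.dist-info</code> style <code>Requires-Dist</code> entries with an <code>extra == "..."</code> marker; the function name and the sample metadata are hypothetical, and <code>.egg-info</code> metadata (which stores this information differently) and compound markers are not handled.

```python
import re

# Finds the extra named in a marker such as: extra == 'toml'
_EXTRA_MARKER = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def requirements_for_extra(requires_dist, extra):
    """Return the requirements that apply only to the given extra.

    requires_dist is a list of Requires-Dist values, e.g.
    "toml ; extra == 'toml'". Entries without an extra marker belong
    to the base package and are skipped.
    """
    selected = []
    for entry in requires_dist:
        requirement, _, marker = entry.partition(";")
        found = _EXTRA_MARKER.search(marker)
        if found and found.group(1) == extra:
            selected.append(requirement.strip())
    return selected


# Hypothetical metadata for illustration only.
sample = ["setuptools", "toml ; extra == 'toml'"]
print(requirements_for_extra(sample, "toml"))
```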

Packagers can opt out from automatically generated dependencies on
Python extras by defining the <code>%_python_no_extras_requires</code>
macro to any value (usually <code>1</code>) in the spec file. This
should only be a temporary measure until the missing extra is
packaged. If the upstream dependency information is not accurate,
please work with upstream to fix it.

=== Coordinated effort to avoid breakage ===

The change owners have
[https://copr.fedorainfracloud.org/coprs/g/python/python-extras/
collected data about non-modular packages in Copr]. Note that ~270
packages failed to build for unrelated reasons and hence we are missing data
for them. However, ~3300 packages built successfully.

The following extras metapackages will be added to avoid broken dependencies:

 autobahn[twisted]
 cachecontrol[filecache]
 cairocffi[xcb]
 cli-helpers[styles]
 docker[ssh]
 fonttools[ufo]
 fonttools[unicode]
 ipython[notebook]
 lunr[languages]
 oauthlib[signedtoken]
 pyjwt[crypto]
 raven[flask]
 requests[security]
 requests[socks]
 tabulate[widechars]
 twisted[tls]
 vistir[spinner]

The following components will be modified:

 python-autobahn
 python-CacheControl
 python-cairocffi
 python-cli-helpers
 python-docker
 fonttools
 ipython
 python-lunr
 python-oauthlib
 python-jwt
 python-raven
 python-requests
 python-tabulate
 python-twisted
 python-vistir

When we added the metapackages for these extras in our testing Copr,
no new broken requires on Python extras were generated. In other
words, these new extras subpackages don't require adding any more
extras subpackages. No extras are required by the remaining Python 2
packages in Fedora.

Once the change in the dependency generator is deployed in rawhide,
the change owners will monitor all newly added requires on missing
extras and will add new metapackages as needed.

5 source packages in Fedora already have Python extras
meta-subpackages with the proposed naming pattern, but they don't have
any listed <code>%files</code>. They will be non-intrusively adapted
via pull requests that add the <code>%ghost</code> file entry to
the metapackage(s). Maintainers can then decide whether to opt for a
simpler rawhide-only spec file with <code>%python_extras_subpkg</code>
or to maintain the current compatibility. This concerns the following
18 subpackages:

 python3-dask+{array,bag,dataframe,delayed}
 python3-django-storages+{azure,boto,boto3,dropbox,libcloud,sftp}
 python3-dns-lexicon+{easyname,gratisdns,henet,hetzner,plesk,route53}
 python3-drf-yasg+validation
 python3-prometheus_client+twisted

==== Modular packages ====

The change owners are only capable of monitoring and adapting
non-modular packages. Due to long standing issues, we are unable to
inspect, query (or do a targeted rebuild of) modular content:

* https://pagure.io/modularity/issue/160
* https://pagure.io/modularity/issue/163
* https://pagure.io/modularity/issue/165

If there are people available to help with this problem, the change
owners will gladly accept their help. We are excluding modular
content not because we want to, but because we don't know how
to work with it at scale.

=== How to add a Python extras subpackage to my package? ===

In this section, we'll walk step by step through adding a
Python extras subpackage to your package. Imagine you maintain
<code>python-requests</code> and a maintainer of a dependent package
contacts you: "I would like you to add a subpackage for
<code>requests[security]</code>, because my package requires it."

# Locate the <code>%files</code> section for
<code>python3-requests</code> package in
<code>python-requests.spec</code>.
# Find the entry for <code>.egg-info</code> or <code>.dist-info</code>
metadata directory. If the entry is generalized with globs like
<code>%{python3_sitelib}/*</code>, please make the <code>%files</code>
section more explicit while you are at it. Copy the line with the metadata
directory. In this guide we assume it is
<code>%{python3_sitelib}/*.egg-info</code>.
# Locate the <code>%description</code> of the
<code>python3-requests</code> package.
# After the description, add:
<code>%{?python_extras_subpkg:%python_extras_subpkg -n
python3-requests -i %{python3_sitelib}/*.egg-info security}</code> on
a separate line.
# Build the package (e.g. in local mock).
# Verify the <code>python3-requests+security</code> package is built
and provides <code>python3dist(requests[security])</code>.
# Check that the new extras package has no dependencies on packages
missing from Fedora (extras or "basic"), and add those
if needed.
# Ship the change in Fedora 33+. It should do nothing in Fedora 31/32
or current EPELs.

=== Packaging guidelines ===

The change owners will describe this concept in the Python packaging
guidelines and will propose the following rules for the Fedora
Packaging Committee to approve:

* Packagers MAY add Python extras metapackages as needed.
* The Python extras metapackages MUST require the base package (exact NEVR).
* Packagers MAY add strong or weak dependencies on the extras
metapackages from the base package as they see fit.
* Packagers SHOULD NOT add Python extras metapackages with
dependencies only useful for maintaining the package (usually extras
called dev/test/doc/build/...).
** Optional: Packagers MAY package tests separately into the
<code>[test]</code> or <code>[testing]</code> extras subpackage.
* If a Fedora package requires a Python extra of a different package,
the extras metapackage MUST be added to that package to avoid broken
dependencies.
* Packagers MAY temporarily disable the automatic requires on extras
subpackages (by defining <code>%_python_no_extras_requires</code>)
until the missing metapackage is introduced, but they SHOULD notify
the maintainer of the package they depend on about the situation.
* If upstream drops an extra, even though it is discouraged by
upstream documentation
([https://setuptools.readthedocs.io/en/latest/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies
see final paragraph]), the metapackage SHOULD be Obsoleted from the
base package or, if there is continuity, from another extras
metapackage.
* If the upstream Python package name contains <code>+</code>, it MUST
be replaced with <code>-</code> in package names (in accordance with
the upstream [https://www.python.org/dev/peps/pep-0503/#normalized-names
Python package names normalization]).

== Feedback ==
This has been briefly discussed in general terms
[https://github.com/rpm-software-management/rpm/issues/1061 upstream].
People tend to agree that some solution is needed. The concrete
proposal contained in this Fedora Change is based on the discussion,
but has received no feedback yet.

More feedback will be documented here once the change proposal is
announced and discussed in Fedora.

== Benefit to Fedora ==
* Packages will have more accurate automatic dependencies, and the
hard-to-maintain and error prone manual transitive (and other)
dependencies can be dropped.
* There will be fewer missing and redundant dependencies.
* Python packagers will have fewer manual dependencies to worry about
and fewer problems to work around.
* The handling of Python extras will be standardized.
* Overall, the Python ecosystem in Fedora will be closer to upstream.

== Scope ==
* Proposal owners:
** Polish and merge the code changes for
<code>python-rpm-generators</code> and <code>python-rpm-macros</code>
linked above.
** Add the 17 missing extras metapackages listed in this change to
avoid broken dependencies (using pull requests or provenpackager
powers if need be).
** Adapt the 18 existing Python extras subpackages (in 5 source
packages) listed in this change to work with the dependency generator
(using pull requests, or provenpackager powers if need be).
** Monitor new dependencies on Python extras subpackages, add extras
subpackages where needed (using pull requests, or provenpackager
powers if need be).
** Propose the updated Python packaging guidelines to FPC for approval.
** Provide help and guidance for packagers.
** Optional: Prepare <code>pyproject-rpm-macros</code> integration of
this change.

* Other developers:
** No immediate action necessary.
** They can opt in for more metapackages with extras.
** They can review and merge pull requests.
** They should follow the updated Python packaging guidelines if the
changes are approved by FPC.
* Release engineering: No releng impact anticipated. The new
dependencies will be primarily generated by the mass rebuild, but if
the mass rebuild is missed, the package maintainers or change owners
can rebuild the packages that will gain the new automatic Requires
on Python extras.
* Policies and guidelines: Yes, see detailed description.
* Trademark approval: Not needed for this Change.


== Upgrade/compatibility impact ==
No impact anticipated.

== How To Test ==
Check that there are packages that Require
<code>python3.9dist(basename[extrasname])</code>. You can use the
following repoquery:

 dnf repoquery --repo=rawhide --whatrequires 'python3.9dist(*\[*\])'

Check that there are Python extras metapackages with the correct
Provides, for example by installing the packages returned by the above
query, or manually via queries like:

 dnf repoquery --repo=rawhide --whatprovides 'python3.9dist(requests\[security\])'

To query all existing Python extras metapackages, you can use:

 dnf repoquery --repo=rawhide --provides -a | grep -E 'python(3\.9|2\.7)dist\(\S+\[\S+\]\)'

And lastly, to query all required Python extras metapackages:

 dnf repoquery --repo=rawhide --requires -a | grep -E 'python(3\.9|2\.7)dist\(\S+\[\S+\]\)'
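
If you prefer to post-process saved query output in Python rather than with <code>grep</code>, the same pattern can be applied as in the sketch below. The function name <code>extras_lines</code> and the sample lines are made up for illustration; only the regular expression is taken from the commands above.

```python
import re

# Same pattern as in the grep commands above.
EXTRAS_DEP = re.compile(r"python(3\.9|2\.7)dist\(\S+\[\S+\]\)")


def extras_lines(lines):
    """Keep only the lines that mention a versioned Python extras dependency."""
    return [line for line in lines if EXTRAS_DEP.search(line)]


# Hypothetical sample of repoquery output lines.
sample = [
    "python3.9dist(requests[security])",
    "python3.9dist(requests)",
    "python3dist(requests[security])",
]
print(extras_lines(sample))
```

Note that, like the <code>grep</code> pattern, this only matches the minor-versioned form (<code>python3.9dist</code> or <code>python2.7dist</code>), not <code>python3dist</code>.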

== User Experience ==
When installing Python RPM packages, the dependencies are more likely
to fulfill user expectations, as they will more closely adhere to the
behavior of pip (the Python package installer).

== Dependencies ==
Nothing.

== Contingency Plan ==
* Contingency mechanism: (What to do?  Who will do it?)
** Soft: The change owners will disable the requirements generator by
default and rebuild (or untag if FTBFS) packages with broken
dependencies caused by the change.
** Hard: The change owners will revert everything and rebuild (or
untag if FTBFS) packages with new requirements/provides caused by the
change.
* Contingency deadline: Beta freeze
* Blocks release? No

== Documentation ==
The packaging guidelines will be the documentation if approved. If
not, this Fedora Change shall serve as the documentation.



-- 
Ben Cotton
He / Him / His
Senior Program Manager, Fedora & CentOS Stream
Red Hat
TZ=America/Indiana/Indianapolis



