Re: Effective license analysis: required or not?

On 24. 08. 23 at 20:15, Richard Fontana wrote:
On Mon, Aug 21, 2023 at 7:04 AM Florian Weimer <fweimer@xxxxxxxxxx> wrote:
I think Richard said that he would start a thread like this, but it
hasn't happened, so I feel like I should get this off my chest now.
<https://docs.fedoraproject.org/en-US/legal/license-field/#_no_effective_license_analysis>
starts with this:

| No “effective license” analysis
|
| The License: field is meant to provide a simple enumeration of the
| licenses found in the source code that are reflected in the binary
| package. No further analysis should be done regarding what the
| "effective" license is, such as analysis based on theories of GPL
| interpretation or license compatibility or suppositions that
| “top-level” license files somehow negate different licenses appearing
| on individual source files.

This is contradictory.  I think there are two aspects here:

* Determine possible licenses that end up in the binary package.

* Perform algebraic simplifications on the license list.

Both analyses are forms of effective licensing analysis.  Of course, you
cannot derive an SPDX identifier without doing any analysis.  However, I
strongly believe that the first approach (determining the binary package
license) is itself a form of effective licensing analysis, and similar
reasons for package maintainers not doing this apply.  The derived
SPDX identifier will reflect both the package source code and what went
into the build system.
We were using "effective license" somewhat more narrowly, referring to
how that phrase was used in some of the legacy Fedora documentation as
well as how it is used sometimes in non-Fedora FLOSS-legal contexts. I
am certain the phrase was not invented by Fedora but somehow it crept
into FLOSS legal commentary about 10 or so years ago and I wasn't even
aware it was used in Fedora documentation until last year. It
partially embodies (usually in a highly distorted way) a much older
set of folk-understandings of the operation of the *GPL license family
in particular but is often used more generally. It may have some
connection to what SPDX calls the "concluded license" (which is
contrasted with the "declared license") but to be honest I am not sure
what those concepts mean.

It's true that in a less specific way we are doing lots of "effective
license analysis", for example anytime I have said that something is
"not a license" despite the license text appearing in some source
code.

Below, I'm collecting a list of observations of what I believe is the
current approach in this area, as taken by package maintainers carrying
out the SPDX conversion.  To me, it strongly suggests that the SPDX
identifiers we derive today do not accurately reflect binary RPM package
licensing, even when lots of package maintainers put in the extra effort
to determine binary package licenses.

* Most package maintainers probably assume that License: tags on all
   built RPMs (source RPMs and binary RPMs) should reflect binary package
   contents, at least when all subpackages are considered in aggregate.
   Often, Source RPMs contain the same License: line as binary RPMs.
This is the most important issue I was hoping to raise, if we mean the
same thing.

We (Jilayne and I and others who worked on the new Fedora
license-related docs) did not invent this concept. The old Fedora
documentation had a "license of the binary" policy, I assume developed
mainly by Tom Callaway, that I always thought was a great analytical
or representational advance. Here's what the old Fedora docs said:

The oldest archived version of
http://fedoraproject.org/wiki/Packaging:LicensingGuidelines (dated from
2008) says "The License: field refers to the licenses of the contents
of the *binary* rpm." The author was clearly at pains to make clear
that it was not meant to encompass the entirety of the source code as
packaged in source RPMs.

At least since 2009, this was followed by: "If a source package
generates multiple binary packages, the License: field may differ
between them if necessary. This implies that a single spec may have
multiple per-subpackage License: tags. Each of those License: tags
must comply with all applicable guidelines."
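
To make the distinction in the quoted rule concrete, here is a minimal
Python sketch contrasting per-subpackage License: values with a single
aggregate value. The subpackage names and their license sets are
hypothetical, and license_field() is an illustrative helper, not part
of any Fedora tooling.

```python
def license_field(licenses):
    """Join distinct SPDX identifiers with AND, in sorted order."""
    return " AND ".join(sorted(set(licenses)))


# Hypothetical data: licenses covering the files shipped in each binary
# subpackage built from one source package.
subpackages = {
    "foo": ["MIT", "BSD-2-Clause"],
    "foo-libs": ["MIT"],
    "foo-doc": ["MIT", "OFL-1.1"],
}

# "License of the binary" rule: one License: value per binary subpackage.
per_subpackage = {name: license_field(ids) for name, ids in subpackages.items()}

# Single aggregate value covering all binary subpackages together.
aggregate = license_field(i for ids in subpackages.values() for i in ids)

for name, value in per_subpackage.items():
    print(f"{name}: License: {value}")
print(f"aggregate: License: {aggregate}")
```

In this made-up case the aggregate value attributes OFL-1.1 to foo-libs
even though only foo-doc ships anything under it, which is the kind of
difference the per-subpackage rule is meant to capture.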

I thought I understood what that meant and I thought I saw examples of
that in operation. Recently, I've started to wonder whether I
misunderstood that all along, though I don't see how. The text seems
very clear to me.

When I look randomly at spec files of Fedora packages, I begin to
suspect that most Fedora package maintainers must have always ignored
this directive and have continued to ignore it after the rule was
recast in the post-July-2022 docs. In *most* cases of packages other
than possibly those coming from ecosystems or historical contexts
featuring highly uncomplicated licensing structures, there will be
some differences in the makeup of binary packages from a built source
code licensing standpoint. I only rarely see attempts to reflect this
via multiple License: fields. While in the scheme of things I only
look at a small sample of Fedora packages I suspect they are
representative.

I can conclude one of two things:
1. The license of the binary rule is too hard for most Fedora package
maintainers to comply with.
2. Fedora package maintainers are unaware of the rule and are
substituting their own intuition, which I think must be something like
"each RPM should have one License: field that reflects the makeup of
all the binary RPMs without attempting to distinguish among them".

BTW I don't think #1 is "The license of the binary rule is too hard
for most Fedora package maintainers to comply with *without the
application of effective licensing folkloric concepts*". Because even
when "effective licensing" was assumed by some Fedora package
maintainers to be legitimate (even though it was never consistently
endorsed in Fedora legal/packaging rules) it must be the case that
most Fedora package maintainers were still ignoring the rule.


We try to follow this guideline in e.g. Ruby:

https://src.fedoraproject.org/rpms/ruby/blob/rawhide/f/ruby.spec

However, I am afraid we completely fail for all rubygem-*-doc subpackages, where most of the payload is typically generated documentation that contains bundled fonts and libraries. Addressing this is non-trivial (mainly, we don't really want to diverge from upstream, but other users consuming similar content via the `gem` command don't care too much). I have been trying to find a reasonable solution (and closing my eyes) for years. At the very least, I would like to stay within the spirit of how upstream works:

https://bugzilla.redhat.com/show_bug.cgi?id=1224715

https://lists.fedoraproject.org/archives/list/ruby-sig@xxxxxxxxxxxxxxxxxxxxxxx/thread/5DIFYBGRQK6POZSUCWFLPMZFJRP34YSS/




This puzzles and disappoints me since, as I have said, the license of
the binary concept was in my view a major advance in the way people
were thinking about appropriate ways of representing licenses of
packages.


Even if the situation is not perfect (or maybe it is even worse ;) ), I still think it is the right thing to try to do.


Vít


  If you look into SPDX, for example, SPDX doesn't even have
(as far as I can tell) a sophisticated way of distinguishing between
binary and source licensing. I believe this reflects the source
code-centric and non-packaging-centric world view of many of the
people who got involved with SPDX early on, but that may be unfair.

When we (a bunch of us inside Red Hat that is) started to think about
revamping the rules on RPM license metadata, we thought about a number
of options. One thing I should note is that my enthusiasm for a
"license of the binary" rule was never really shared by anyone else I
talked to at Red Hat (though I think this is partly because those who
I discussed it with came from those "source code centric" backgrounds
wrt open source license compliance and such). Anyway, we considered
switching to a "license of the source" rule, sort of like how I think
Petr Pisar is choosing to use the Source-License: field. We also
considered a more complex sort of "license of the binary" rule that
would attempt to do what I thought of as orthodox GPL-style analysis
on the components of binary RPMs (so that a binary RPM might have
"License: GPL-2.0-or-later AND GPL-2.0-or-later") but this was
rejected as unnecessarily complicated. We ended up with the "simple
enumeration of the licenses of the binary" rule which is in the
current Fedora docs, which I think of as a restatement of the 2009 (or
earlier) "license of the binary" rule. This was also discussed on this
list prior to incorporation into the present-day legal docs.

I'm deliberately ignoring most of the rest of your comments in this
message because I think they raise some additional topics, because I
want to make sure there is some focus on this one. What do we do about
the "license of the binary" rule? If it is really too hard to comply
with, I think we can only conclude that it has to be replaced with
some other approach. Since I'm not a Fedora package maintainer I do
not have good intuition for what's too hard vs. what's merely annoying
or cumbersome. I know why I find it challenging to figure out what
source files map to a given binary RPM, but I don't really directly
understand why this is hard for a Fedora package maintainer who is
theoretically highly familiar with the code they are packaging and
theoretically has some expertise in the language(s) and build tools at
issue. I just see the evidence suggesting that it is.

In the light of this, I would like to suggest updating the guidelines in
the following way:

   The License: line should be based on the sources only.  Using a tool
   such as Fossology to discover relevant licenses and their SPDX tags is
   sufficient.  No analysis of how licenses from package source code or the
   build environment propagate into binary RPMs should be performed.
   Individual SPDX identifiers that a tool has listed should be separated
   by AND.  Package maintainers are encouraged to re-run license analysis
   tooling on the source code as part of major package rebases, and
   update the License: tag accordingly.
This seems to be close to what is *really* happening today, except
that there are categories of things that package maintainers know they
can exclude as a matter of convention.
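
As a rough illustration of the "separate the identifiers with AND" step
in the proposal above, here is a minimal Python sketch. The file paths
and detected expressions are hypothetical stand-ins for scanner output,
and join_with_and() is an illustrative helper rather than anything
Fossology or Fedora provides. The one real detail it relies on is that
AND binds more tightly than OR in SPDX license expressions, so a
detected expression containing OR has to be parenthesized before it is
joined.

```python
def join_with_and(expressions):
    """Combine SPDX expressions with AND, keeping first-seen order.

    Expressions containing OR are wrapped in parentheses, because AND
    binds more tightly than OR in SPDX license expressions. The OR check
    is a plain substring test, so this is a sketch, not a parser.
    """
    seen = []
    for expr in expressions:
        expr = expr.strip()
        if expr and expr not in seen:
            seen.append(expr)
    if len(seen) == 1:
        return seen[0]
    return " AND ".join(f"({e})" if " OR " in e else e for e in seen)


# Hypothetical scanner output: one detected expression per source file.
detected = {
    "src/main.c": "GPL-2.0-or-later",
    "src/util.c": "MIT OR Apache-2.0",
    "lib/compat.c": "BSD-3-Clause",
    "src/cli.c": "GPL-2.0-or-later",
}

license_line = "License: " + join_with_and(detected.values())
print(license_line)
```

Nothing here decides which files actually end up in a binary RPM or
which ones a maintainer would exclude by convention; it only does the
mechanical joining.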

Richard
_______________________________________________
legal mailing list -- legal@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to legal-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/legal@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
