On 24. 08. 23 at 20:15, Richard Fontana wrote:
On Mon, Aug 21, 2023 at 7:04 AM Florian Weimer <fweimer@xxxxxxxxxx> wrote:

I think Richard said that he would start a thread like this, but it hasn't happened, so I feel like I should get this off my chest now.

<https://docs.fedoraproject.org/en-US/legal/license-field/#_no_effective_license_analysis> starts with this:

| No “effective license” analysis
|
| The License: field is meant to provide a simple enumeration of the
| licenses found in the source code that are reflected in the binary
| package. No further analysis should be done regarding what the
| "effective" license is, such as analysis based on theories of GPL
| interpretation or license compatibility or suppositions that
| “top-level” license files somehow negate different licenses appearing
| on individual source files.

This is contradictory. I think there are two aspects here:

* Determine possible licenses that end up in the binary package.
* Perform algebraic simplifications on the license list.

Both analyses are forms of effective licensing analysis. Of course, you cannot derive an SPDX identifier without doing any analysis. However, I strongly believe that the first approach (determining the binary package license) is itself a form of effective licensing analysis, and similar reasons for package maintainers not doing this apply. The derived SPDX identifier will reflect both the package source code and what went into the build system.

We were using "effective license" somewhat more narrowly, referring to how that phrase was used in some of the legacy Fedora documentation as well as how it is used sometimes in non-Fedora FLOSS-legal contexts. I am certain the phrase was not invented by Fedora, but somehow it crept into FLOSS legal commentary about 10 or so years ago, and I wasn't even aware it was used in Fedora documentation until last year.
It partially embodies (usually in a highly distorted way) a much older set of folk-understandings of the operation of the *GPL license family in particular but is often used more generally. It may have some connection to what SPDX calls the "concluded license" (which is contrasted with the "declared license") but to be honest I am not sure what those concepts mean. It's true that in a less specific way we are doing lots of "effective license analysis", for example anytime I have said that something is "not a license" despite the license text appearing in some source code.

Below, I'm collecting a list of observations of what I believe is the current approach in this area, as taken by package maintainers carrying out the SPDX conversion. To me, it strongly suggests that the SPDX identifiers we derive today do not accurately reflect binary RPM package licensing, even when lots of package maintainers put in the extra effort to determine binary package licenses.

* Most package maintainers probably assume that License: tags on all built RPMs (source RPMs and binary RPMs) should reflect binary package contents, at least when all subpackages are considered in aggregate. Often, Source RPMs contain the same License: line as binary RPMs.

This is the most important issue I was hoping to raise, if we mean the same thing. We (Jilayne and I and others who worked on the new Fedora license-related docs) did not invent this concept. The old Fedora documentation had a "license of the binary" policy, I assume developed mainly by Tom Callaway, that I always thought was a great analytical or representational advance. Here's what the old Fedora docs said:

The oldest archived version of http://fedoraproject.org/wiki/Packaging:LicensingGuidelines (dated from 2008) says "The License: field refers to the licenses of the contents of the *binary* rpm." The author was clearly at pains to make clear that it was not meant to encompass the entirety of the source code as packaged in source RPMs.
At least since 2009, this was followed by: "If a source package generates multiple binary packages, the License: field may differ between them if necessary. This implies that a single spec may have multiple per-subpackage License: tags. Each of those License: tags must comply with all applicable guidelines."

I thought I understood what that meant and I thought I saw examples of that in operation. Recently, I've started to wonder whether I misunderstood that all along, though I don't see how. The text seems very clear to me.

When I look randomly at spec files of Fedora packages, I begin to suspect that most Fedora package maintainers must have always ignored this directive and have continued to ignore it after the rule was recast in the post-July-2022 docs. In *most* cases of packages other than possibly those coming from ecosystems or historical contexts featuring highly uncomplicated licensing structures, there will be some differences in the makeup of binary packages from a built source code licensing standpoint. I only rarely see attempts to reflect this via multiple License: fields. While in the scheme of things I only look at a small sample of Fedora packages, I suspect they are representative.

I can conclude one of two things:

1. The license of the binary rule is too hard for most Fedora package maintainers to comply with.
2. Fedora package maintainers are unaware of the rule and are substituting their own intuition, which I think must be something like "each RPM should have one License: field that reflects the makeup of all the binary RPMs without attempting to distinguish among them".

BTW I don't think #1 is "The license of the binary rule is too hard for most Fedora package maintainers to comply with *without the application of effective licensing folkloric concepts*".
Because even when "effective licensing" was assumed by some Fedora package maintainers to be legitimate (even though it was never consistently endorsed in Fedora legal/packaging rules) it must be the case that most Fedora package maintainers were still ignoring the rule.
We try to follow this guideline in e.g. Ruby: https://src.fedoraproject.org/rpms/ruby/blob/rawhide/f/ruby.spec

However, I am afraid we completely fail for all rubygem-*-doc subpackages, where the largest part of the payload is typically generated documentation, which contains bundled fonts and libraries. Trying to address this is non-trivial (mainly, we don't really want to diverge from upstream, but other users consuming similar content via the `gem` command don't care too much). I have been trying to find a reasonable solution (while closing my eyes) for years. At least we keep to the spirit of how upstream works:
https://bugzilla.redhat.com/show_bug.cgi?id=1224715
https://lists.fedoraproject.org/archives/list/ruby-sig@xxxxxxxxxxxxxxxxxxxxxxx/thread/5DIFYBGRQK6POZSUCWFLPMZFJRP34YSS/
This puzzles and disappoints me since, as I have said, the license of the binary concept was in my view a major advance in the way people were thinking about appropriate ways of representing licenses of packages.
Even if the situation is not perfect (or maybe it is even worse ;) ), I still think it is the right thing to try to do.
Vít
If you look into SPDX, for example, SPDX doesn't even have (as far as I can tell) a sophisticated way of distinguishing between binary and source licensing. I believe this reflects the source code-centric and non-packaging-centric world view of many of the people who got involved with SPDX early on, but that may be unfair.

When we (a bunch of us inside Red Hat that is) started to think about revamping the rules on RPM license metadata, we thought about a number of options. One thing I should note is that my enthusiasm for a "license of the binary" rule was never really shared by anyone else I talked to at Red Hat (though I think this is partly because those who I discussed it with came from those "source code centric" backgrounds wrt open source license compliance and such). Anyway, we considered switching to a "license of the source" rule, sort of like how I think Petr Pisar is choosing to use the Source-License: field. We also considered a more complex sort of "license of the binary" rule that would attempt to do what I thought of as orthodox GPL-style analysis on the components of binary RPMs (so that a binary RPM might have "License: GPL-2.0-or-later AND GPL-2.0-or-later") but this was rejected as unnecessarily complicated. We ended up with the "simple enumeration of the licenses of the binary" rule which is in the current Fedora docs, which I think of as a restatement of the 2009 (or earlier) "license of the binary" rule. This was also discussed on this list prior to incorporation into the present-day legal docs.

I'm deliberately ignoring most of the rest of your comments in this message because I think they raise some additional topics, and because I want to make sure there is some focus on this one. What do we do about the "license of the binary" rule? If it is really too hard to comply with, I think we can only conclude that it has to be replaced with some other approach.
Since I'm not a Fedora package maintainer I do not have good intuition for what's too hard vs. what's merely annoying or cumbersome. I know why I find it challenging to figure out what source files map to a given binary RPM, but I don't really directly understand why this is hard for a Fedora package maintainer who is theoretically highly familiar with the code they are packaging and theoretically has some expertise in the language(s) and build tools at issue. I just see the evidence suggesting that it is.

In the light of this, I would like to suggest updating the guidelines in the following way:

The License: line should be based on the sources only. Using a tool such as Fossology to discover relevant licenses and their SPDX tags is sufficient. No analysis of how licenses from package source code or the build environment propagate into binary RPMs should be performed. Individual SPDX identifiers that a tool has listed should be separated by AND. Package maintainers are encouraged to re-run license analysis tooling on the source code as part of major package rebases, and update the License: tag accordingly.

This seems to be close to what is *really* happening today, except that there are categories of things that package maintainers know they can exclude as a matter of convention.

Richard

_______________________________________________
legal mailing list -- legal@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to legal-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/legal@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue