On Mon, Aug 21, 2023 at 9:30 AM Daniel P. Berrangé <berrange@xxxxxxxxxx> wrote:
On Mon, Aug 21, 2023 at 01:04:29PM +0200, Florian Weimer wrote:
> I think Richard said that he would start a thread like this, but it
> hasn't happened, so I feel like I should get this off my chest now.
>
> <https://docs.fedoraproject.org/en-US/legal/license-field/#_no_effective_license_analysis>
> starts with this:
>
> | No “effective license” analysis
> |
> | The License: field is meant to provide a simple enumeration of the
> | licenses found in the source code that are reflected in the binary
> | package. No further analysis should be done regarding what the
> | "effective" license is, such as analysis based on theories of GPL
> | interpretation or license compatibility or suppositions that
> | “top-level” license files somehow negate different licenses appearing
> | on individual source files.
>
> This is contradictory. I think there are two aspects here:
>
> * Determine possible licenses that end up in the binary package.
>
> * Perform algebraic simplifications on the license list.
>
> Both analyses are forms of effective licensing analysis. Of course, you
> cannot derive an SPDX identifier without doing any analysis. However, I
> strongly believe that the first approach (determining the binary package
> license) is itself a form of effective licensing analysis, and similar
> reasons for package maintainers not doing this apply. The derived
> SPDX identifier will reflect both the package source code and what went
> into the build system.
It could perhaps be worded better, but I don't see this as contradictory,
it is just a matter of what you consider "effective analysis" to refer to.
The last sentence expands on this to say that 'effective' in this context
refers to the analysis of license compatibility that Fedora previously
recommended.
The 'license compatibility' aspect is what is meant by the process here. Previously Fedora carried a large license compatibility chart and there were rules (both written and passed down through oral tradition) about what is and is not compatible with the different GPL licenses. A common case was a GPL project incorporating BSD licensed code which meant the entire collective work was GPL licensed _per FSF guidelines_. This is the part that Fedora Legal wants to do away with. Package maintainers do not need to perform this compatibility analysis.
So yes, it could probably be worded better.
The analysis maintainers are being asked to do today is not about interpreting
licensing. They are "merely" being asked to determine which source files
contain code that becomes part of the resulting binary RPM. This is more
build system analysis than license analysis, and distinct from what Fedora
would traditionally describe as "effective license analysis".
Correct.
> * Some package maintainers, when translating to SPDX, merely translate
> the existing License: line as best as they can, without looking at the
> actual sources or produced binaries.
This I think is probably the main flaw in the process we asked our
maintainers to follow.
At a high level we portrayed the whole exercise as merely a terminology
change, but it was not.
Given the removal of the effective license analysis requirement, that
was / is an oversimplification.
Strictly speaking I think the exercise ought to have been portrayed as
more of a license (re-)audit. In the general case maintainers ought to
be redoing the license audit part of the new package review process,
for all existing packages, not blindly converting existing terminology.
We did portray this as a re-audit and not simply a change in abbreviations. I did many presentations on just that and we held numerous hack fests where we helped people analyze packages to determine the correct license expression in SPDX.
There are definitely maintainers who thought it was just a change in abbreviations and I do not think there is an easy way to stop that. But as we have been going through this process, the number of maintainers auditing packages has been high and we are seeing more licenses captured and added to SPDX that were not previously represented.
What I would like to see in the future is a tool connected to Bodhi that can run for each build of a package and perform a license analysis, build the License tag string, then compare it to what is in the spec file. If it's different then alert the package maintainer and say "Hey, I'm a script and I analyzed the licenses in this package and I *think* I found something different than what you put in the spec file License tag. If I were you, I would check it out because one of us is probably wrong." Maybe we'll have that at some point.
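As a rough illustration of the comparison step such a tool would need, here is a minimal Python sketch. Everything in it is hypothetical: the function names, the idea that a scanner hands back a plain list of SPDX identifiers, and the very simple tokenizer (it is not a full SPDX expression parser, it treats exception names after WITH like ordinary identifiers, and it does not expand RPM macros). It is not tied to Bodhi or to any existing Fedora tool.

```python
import re

# SPDX expression operators; everything else in the tag is treated as an identifier.
OPERATORS = {"AND", "OR", "WITH"}


def spec_license_ids(spec_text):
    """Return the set of identifiers found in the first License: tag of a spec file."""
    for line in spec_text.splitlines():
        match = re.match(r"\s*License\s*:\s*(.+)", line, flags=re.IGNORECASE)
        if match:
            # Assumption: identifiers consist only of letters, digits, '.', '+' and '-'.
            tokens = re.findall(r"[A-Za-z0-9.+-]+", match.group(1))
            return {t for t in tokens if t.upper() not in OPERATORS}
    return set()


def compare_licenses(spec_text, detected_ids):
    """Report identifiers that appear on only one side of the comparison."""
    declared = spec_license_ids(spec_text)
    detected = set(detected_ids)
    return {
        "missing_from_spec": sorted(detected - declared),
        "not_detected": sorted(declared - detected),
    }


if __name__ == "__main__":
    spec = "Name: foo\nLicense: GPL-2.0-or-later AND (MIT OR BSD-3-Clause)\n"
    report = compare_licenses(spec, ["GPL-2.0-or-later", "MIT", "CDDL-1.0"])
    if report["missing_from_spec"] or report["not_detected"]:
        print("License tag and scan disagree, one of us is probably wrong:", report)
```

A set comparison like this only flags differences in which identifiers appear. It says nothing about whether AND or OR is the right operator, so a maintainer would still have to look at the result.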
> In the light of this, I would like to suggest updating the guidelines in
> the following way:
>
> The License: line should be based on the sources only. Using a tool
> such as Fossology to discover relevant licenses and their SPDX tags is
> sufficient. No analysis how licenses from package source code or the
> build environment propagate into binary RPMs should be performed.
> Individual SPDX identifiers that a tool has listed should be separated
> by AND. Package maintainers are encouraged to re-run license analysis
> tooling on the source code as part of major package rebases, and
> update the License: tag accordingly.
>
> To me, that seems to be much more manageable.
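For concreteness, the "separated by AND" step in the proposal above could look like the following Python sketch. The function name and the input shape (a list of SPDX identifiers or sub-expressions reported by some scanner) are assumptions made for illustration and do not correspond to Fossology's actual output format.

```python
def build_license_tag(detected):
    """Join scanner results into one License: value using AND.

    Assumption: each entry is either a single SPDX identifier or an
    already-formed sub-expression such as "MIT OR Apache-2.0".
    """
    # De-duplicate and sort case-insensitively so the output is stable.
    unique = sorted(set(detected), key=str.lower)
    # Parenthesize OR sub-expressions so that AND does not change their meaning.
    parts = [f"({entry})" if " OR " in entry else entry for entry in unique]
    return " AND ".join(parts)


if __name__ == "__main__":
    print(build_license_tag(["MIT", "GPL-2.0-or-later", "MIT"]))
```

Note that this lists every license the scanner reports, including ones from files that never reach the binary package.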
What I'm not a fan of with this approach is that it would cause us
to include licenses that are clearly irrelevant for Fedora binary
packages. If we consider the "license" tag to be something for end
users to look at, I think this will be misleading.
For example in one package I reviewed there is kernel code that is
only built on Solaris which is under the CDDL. Including that in
the Fedora binary RPM license feels totally wrong.
In many packages using autotools there are snippets of m4 code that
are under a variety of licenses, again not affecting the output.
Those would "bloat" the license tag for little obvious gain.
I do agree though that doing *perfect* build system analysis to figure
out what source files become part of the binary RPMs is impractical
for any non-trivial packages.
My approach has been to scan the source for licenses, and then look
at source files with any licenses I was surprised to see. Often it
is possible to exclude these unexpected licenses, because they are
obviously part of the build system, or are obviously for a different
OS platform.
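The scan-then-exclude approach described above can be sketched in Python as follows. The path patterns, the function name, and the shape of the scan result (a mapping from file path to SPDX identifier) are all hypothetical. A real package would need its own pattern list, and the excluded group is meant as a prompt for manual review, not as an automatic verdict.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files that are "obviously" build system
# or code for a different OS platform; adjust per package.
EXCLUDE_PATTERNS = ["*.m4", "m4/*", "solaris/*", "*/solaris/*"]


def split_scan_results(scan, patterns=EXCLUDE_PATTERNS):
    """Group scanned files by license, separating likely-irrelevant paths.

    scan maps a source file path to the SPDX identifier found in it.
    Returns (kept, excluded), each mapping identifier -> list of paths.
    """
    kept, excluded = {}, {}
    for path, license_id in scan.items():
        irrelevant = any(fnmatchcase(path, pattern) for pattern in patterns)
        target = excluded if irrelevant else kept
        target.setdefault(license_id, []).append(path)
    return kept, excluded


if __name__ == "__main__":
    scan = {
        "src/main.c": "GPL-2.0-or-later",
        "m4/ax_foo.m4": "FSFAP",
        "src/solaris/kmod.c": "CDDL-1.0",
    }
    kept, excluded = split_scan_results(scan)
    print("candidate licenses:", sorted(kept))
    print("review before dropping:", excluded)
```

A license is only a candidate for omission if it appears in the excluded group and nowhere in the kept group.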
I agree here. To me, the source archive already includes its licensing information. What Fedora does is build this source in a curated way for distribution, so the licenses that apply to our build do not necessarily align with other distributions. This is the licensing information that I think is relevant for Fedora to understand and convey to users. There will always be additional licenses in the source archives (given your examples above such as autotools or optional code that is not enabled at build time), but the source archives include all of that licensing information. If they didn't, we would not be able to build anything from that source to include in Fedora.
I would describe this as trying to meet the spirit of having the
RPM license reflect binary content, while acknowledging the reality
that maintainers won't fully analyse the build system as it is too
time consuming & impractical.
I might suggest adding an extra sentence to make it more explicit
that the binary RPM license is not a perfect representation of
the binary content, as it may sometimes include extra licenses from
source files that were not relevant. This would reflect the somewhat
pragmatic approach that I think maintainers already take in practice.
One thing I have learned through this whole SPDX project is that being a convergence of something technical and something not makes it very difficult to arrive at what we think is an acceptable completion. I think what we have now in Fedora is pretty good compared to other distributions, but it can always improve. We just keep making improvements and keep refining the process. Shaving the yak, as it were.
Thanks,
| The License: field is meant to provide a simple enumeration of the
| licenses found in the source code that are reflected in the binary
| package. It may also include additional licenses for files that are
| not part of the binary where it is impractical to filter them out
| during license review. No further analysis should be done regarding
| what the "effective" license is, such as analysis based on theories
| of GPL interpretation or license compatibility or suppositions that
| “top-level” license files somehow negate different licenses appearing
| on individual source files.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
_______________________________________________
legal mailing list -- legal@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to legal-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/legal@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
--
David Cantrell <dcantrell@xxxxxxxxxx>