Re: Effective license analysis: required or not?

On Mon, Aug 21, 2023 at 04:25:22PM +0200, Florian Weimer wrote:
> * Daniel P. Berrangé:
> 
> > On Mon, Aug 21, 2023 at 01:04:29PM +0200, Florian Weimer wrote:
> >> I think Richard said that he would start a thread like this, but it
> >> hasn't happened, so I feel like I should get this off my chest now.
> >> 
> >> <https://docs.fedoraproject.org/en-US/legal/license-field/#_no_effective_license_analysis>
> >> starts with this:
> >> 
> >> | No “effective license” analysis
> >> |
> >> | The License: field is meant to provide a simple enumeration of the
> >> | licenses found in the source code that are reflected in the binary
> >> | package. No further analysis should be done regarding what the
> >> | "effective" license is, such as analysis based on theories of GPL
> >> | interpretation or license compatibility or suppositions that
> >> | “top-level” license files somehow negate different licenses appearing
> >> | on individual source files.
> >> 
> >> This is contradictory.  I think there are two aspects here:
> >> 
> >> * Determine possible licenses that end up in the binary package.
> >> 
> >> * Perform algebraic simplifications on the license list.
> >> 
> >> Both analyses are forms of effective licensing analysis.  Of course, you
> >> cannot derive an SPDX identifier without doing any analysis.  However, I
> >> strongly believe that the first approach (determining the binary package
> >> license) is itself a form of effective licensing analysis, and similar
> >> reasons for package maintainers not doing this apply.  The derived
> >> SPDX identifier will reflect both the package source code and what went
> >> into the build system.
> >
> > It could perhaps be worded better, but I don't see this as contradictory,
> > it is just a matter of what you consider "effective analysis" to refer to.
> > The last sentence expands on this to say that 'effective' in this context
> > is referring to the analysis of license compatibility that Fedora previously
> > recommended.
> 
> I think it goes beyond terminology.  I think determining the binary RPM
> licenses has similar complexities to the license algebra.  I can't
> imagine consensus emerging around that.  There's just no firm reasoning
> why we ignore header files and dynamic linking in the License: tag, the
> glibc startup code, but not static linking in general.  I think coming
> up with consistent rules is even more complicated than some sort of
> license algebra, or rules for ignoring certain copyright files.  So I
> think the perceived simplification of the rules fell short, and the
> present rules are still unworkable.

WRT header file / glibc startup / static linking licenses being
ignored, the rationale I would express is that those pieces must
(by implication) all already be license compatible (in some way)
with the package consuming them. Admittedly, though, this is another
case of "effective license" doctrine, albeit an implicit one
rather than one made explicit by the maintainer / package reviewer.


> > What I'm not a fan of with this approach is that it would cause us
> > to include licenses that are clearly irrelevant for Fedora binary
> > packages. If we consider the "license" tag to be something for end
> > users to look at, I think this will be misleading.
> 
> We can come up with something that looks at the state of the tree after
> %prep, or something like that.
> 
> The problem with dropping stuff arbitrarily is that it makes it again
> impossible to rely on tooling.

IMHO no matter what we do, the value of the License field is rather
limited for semantic interpretation by automated tooling, because
it is reducing a very complex situation down to a very crude
expression.

It is notable that the Debian "copyright" file format and the REUSE
format both provide a massively more granular expression of package
licensing, targeted at machine processing.

Although our new SPDX expressions are better for machine readability
than in the past, we should be explicit about the limitations of our
data and the problems with attempting to do any semantic analysis based
on it.

> > For example in one package I reviewed there is kernel code that is
> > only built on Solaris which is under the CDDL. Including that in
> > the Fedora binary RPM license feels totally wrong.
> 
> I disagree.  Upstream may have copied code from the CDDL part of the
> tree to other parts without updating the license.  If we ignore the CDDL
> license, we say that hasn't happened, and I doubt we are in the position
> to make such a certification for most packages.
> 
> Of course someone may have copied code from a Stack Overflow answer
> (which is generally available under incompatible license terms), and we
> wouldn't know about that either.  But suppressing license information
> actually present in the source package (although in a supposedly unused
> location) seems different.
> 

I don't think it is different. Both are a case of garbage-in == garbage-out.

If upstream copied CDDL code into a file and didn't record this in the
file's stated license, then that's a problem whether the original CDDL
code is part of the same project or from Stack Overflow. In both cases
upstream made a mistake and failed to record accurate license info in
the source file.

We're not making any judgement or statement about the accuracy of
upstream's licensing record. We're summarizing what upstream has
presented in its source files and taking that on faith (unless someone
happens to notice some blatant inaccuracy).

This feels like a case where we should better document what our
input assumptions are with License tag data.

Debian copyright files and REUSE data will suffer the same limitation
as they're both promoting a view that license information is trackable
and analysable per file, so if upstream fails to record a license the
copyright/REUSE files will similarly be inaccurate.

> > I might suggest adding an extra sentence to make it more explicit
> > that the binary RPM license is not a perfect representation of
> > the binary content, as it may sometimes include extra licenses from
> > source files that were not relevant. This would reflect the somewhat
> > pragmatic approach that I think maintainers already take in practice
> 
> I would welcome that.  And update the Rust guidelines accordingly, to
> clarify that the kind of buildroot-to-binary-RPM propagation that the
> tooling performs is optional and not required by (the spirit of) the
> Fedora guidelines.

I agree with your general point that we've not adequately documented
many of the assumptions / simplifications that maintainers will / should
take when analysing license data in source files. Probably the various
scenarios you've illustrated should be answered in some way in the
licensing pages.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
_______________________________________________
legal mailing list -- legal@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to legal-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/legal@xxxxxxxxxxxxxxxxxxxxxxx
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue



