Re: SPDX Statistics - R.U.R. edition

On Sun, Sep 17, 2023 at 11:37 AM Mark Wielaard <mark@xxxxxxxxx> wrote:
>
> To be clear I don't mind using a different set of short-hands in the
> License tags. Although it feels a little odd to try to create separate
> identifiers for lax-permissive MIT/BSD like licenses which sometimes
> differ in just one or two words.

FWIW, usually a difference of one or two words wouldn't be enough to
result in creation of a distinct SPDX identifier. The standard applied
by SPDX is, informally, whether the difference is "legally
substantive" (this has its flaws but seems to work OK in practice).

I think anyone should be free to propose a new umbrella identifier (in
SPDX expression format) that would cover multiple licenses, as we've
done with `LicenseRef-Fedora-Public-Domain` and
`LicenseRef-Fedora-UltraPermissive`. The important thing is that it be
well defined in some way.

> However I really don't understand the purpose or goal of this idea of
> creating a large expression of all the license/permission notices that
> might be found in the sources that possibly make up the binary to the
> License tag. Because in my experience those are often not relevant or
> accurate at all.

The more important objective is to have package reviewers periodically
review packages to ensure that all licenses are 'allowed' by Fedora,
or covered by some documented exception. This activity should then
make it possible to create a license tag reflecting the licenses found
(to the extent they are relevant to the Fedora context beyond merely
existing in the source code). "Relevant" includes any situation where
a license covers something in an installable package. I think we may
disagree on the "covers" part.

This is not really a new requirement. The idea that the license tag
should reflect the various licenses that apply to a binary is
something that has been in place for ~15 years, though it may have
been documented in an inconsistent or contradictory way. I think the
use of umbrella identifiers in the Callaway system somewhat obscured
the fact that this was the expected approach.

> > > What is the goal of dropping the effective license and making packagers
> > > list all the licences of some code snippets originally incorporated
> > > under lax-permissive licenses? Is that not just make-work for the
> > > packager if upstream just uses one effective license?
> >
> > One rationale is given in Fedora legal documentation:
> > "There is no agreed-upon set of criteria or rules under which one can
> > make conclusions about “effective” licenses or reduce composite
> > license expressions to something simpler."
>
> Isn't that just like most other things in Fedora? We follow
> upstream. Upstream states the (effective) license and we just adopt
> that. If we notice that there might be a bug and the effective license
> isn't exactly as the upstream project states, then we fix that
> upstream?

I basically don't recognize "effective license" as a valid concept. I
see people using it, perhaps increasingly, but I never see any
definition of what it means. It sounds like you are using it to mean
"whatever the upstream project seems to say the license is, despite
possible evidence to the contrary". I'm not sure that's how other
people are using "effective license".

I think Jilayne would disagree with this, but in practice, I also
don't see what we could fix upstream, since there is no standard for
how you communicate or document what the effective license is
(regardless of what it means). The only related standard I know of for
documenting licensing of projects is REUSE (https://reuse.software)
which I think implicitly also rejects the concept of "effective
licensing".

> > Basically, everyone has been making up their own interpretive system
> > for deciding what an "effective license" is, with no consistencies
> > across upstream packages and Fedora package maintainers.
>
> Is this really a problem? Could you show an example where an upstream
> or package maintainer stated in the license tag that the effective
> license was say "GPLv3+", but it would have been more "correct" to state
> that it was "GPL-3.0-or-later AND GPL-3.0-or-later WITH
> Autoconf-exception-generic-3.0 AND GPL-3.0-or-later WITH
> Bison-exception-2.2 AND GPL-2.0-or-later AND GPL-2.0-or-later WITH
> Autoconf-exception-generic AND LGPL-2.1-or-later AND LGPL-2.0-or-later
> AND X11"?

I will not argue this, but I will make two observations. One is
something I've said before, which is that people seem to be
complaining about the current standards for license tags only when
they are lengthy. I think it would be more consistent to argue that we
don't need license tags at all. I have no attachment to RPM-style
license tags, though Red Hat finds them marginally useful for some
purposes.

The other thing is that the discipline that produces license tags at
this level of detail is what is needed to uncover licensing problems
in packages, from Fedora's perspective as a distribution that aims to
be made up of free software. That is, the detailed license tags are a
side effect of a valuable license review process and I would be
concerned that falling back on an effective license approach would
result in the loss of the benefits we get from that process, which
actually long precede the abandonment of the Callaway system.

You've contributed to glibc, so you probably know that for many years
(almost 20 years?) glibc gave the impression that its license, its
effective license if you will, was LGPLv2.1 or later (except for parts
that are under the GPL) but it included a substantial amount of code
under a license that, by the mid-2000s, Debian and later Fedora came
to regard as non-free. I am speaking of the famous Sun RPC license,
which prohibits distribution in isolation, a common type of
proprietary license restriction.

In that scenario, if you had a license tag that just says
"LGPL-2.1-or-later" you are concealing the fact that there is also
some code under a license that cannot be assimilated with LGPL (other
than by adopting a clever post hoc interpretation which cannot
possibly be what Sun Microsystems had in mind in the 1980s) and that
is not even free software. It seems to me that at the very least the
license tag, if you're going to have license tags at all, should say
"LGPL-2.1-or-later AND LicenseRef-Assorted-Other-Free-Software AND
LicenseRef-sun-rpc". But if there's a practice of just relying on
whatever the effective license seems to be, you would be inclined not
to notice a license like this in the first place. This is why the
issue was first surfaced by Debian, I think. To your later point about
Debian copyright files, it is obviously true that you don't need to
have a license tag system like Fedora has for this to happen.
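The glibc example above can be made concrete with a minimal sketch. It
splits a license tag into its identifiers and reports any that are not
on an allow-list. The `ALLOWED` set and the function names are
assumptions made up for illustration, not Fedora's actual license data
or tooling, and the tokenizing is a simplification rather than a full
SPDX expression parser.

```python
import re

# Hypothetical allow-list for illustration only; it is not Fedora's
# real license data.
ALLOWED = {
    "LGPL-2.1-or-later",
    "LicenseRef-Assorted-Other-Free-Software",
}

OPERATORS = {"AND", "OR", "WITH"}


def license_ids(expression):
    """Return the distinct license identifiers in a tag, in order.

    Simplification: the token that follows WITH is treated as an
    exception identifier and skipped; parentheses are ignored.
    """
    tokens = re.findall(r"[A-Za-z0-9.+:-]+", expression)
    ids = []
    previous = None
    for token in tokens:
        is_operator = token in OPERATORS
        is_exception = previous == "WITH"
        if not is_operator and not is_exception and token not in ids:
            ids.append(token)
        previous = token
    return ids


def not_allowed(expression, allowed=ALLOWED):
    """Return the identifiers in the tag that are not on the allow-list."""
    return [lic for lic in license_ids(expression) if lic not in allowed]


tag = ("LGPL-2.1-or-later AND LicenseRef-Assorted-Other-Free-Software "
       "AND LicenseRef-sun-rpc")
print(not_allowed(tag))  # ['LicenseRef-sun-rpc']
```

A tag that says only "LGPL-2.1-or-later" would pass this check with
nothing reported, which is exactly the concealment problem described
above.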

While the Sun RPC problem *may* have been excised from glibc, just
last year we found another license in glibc (and at least one other
package), this time an IBM license [1], that we consider non-free by
present day standards, in that case because it involves a patent
license grant that discriminates according to specific use cases. I
think we should aspire to finding, *exposing*, and fixing these kinds
of problems. Exposing should mean at a minimum that we don't
perpetuate a community-wide decades-old practice of covering these
problems up, which seems to be one practical effect of indulging in
effective licensing. I realize all this doesn't itself justify the
resulting use of complex composite SPDX expressions.

> It seems that the "enumeration" expression is not that easy to create
> objectively. If only because it is actually hard to know which sources
> to scan to get the license/permission snippets (just the upstream tar
> ball, the sources as created by fedpkg prep, those actually included
> in the binaries which depend on the build environment, etc.)

Similar points have been raised by others. I think a good solution is
to reformulate what we mean by "enumeration" so that it is more
practical for Fedora package maintainers. I don't think it is a good
solution to just give up and no longer review package source code for
inclusion of licenses that conflict with Fedora licensing policies.

> And what is the actual purpose and goal of including them in the spec License
> tag?

For me, it's that there's no good argument for throwing away the
information once you have it. You've reviewed a given package and
let's say you've identified five applicable licenses (let's assume we
know what applicable means). How do you then decide what information
to hide? I think you are saying, "review the package thoroughly, but
don't report what you find in the license tag, just pick the
license(s) the upstream project indicates are effective". As suggested
above, I'm not opposed to something similar to this approach (I think
Jilayne would disagree though) provided that we always expose licenses
that are not classified as 'allowed' for Fedora.
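A minimal sketch of that compromise, assuming the reviewer already has
the list of applicable licenses. The function name, the five example
licenses, and their classification as allowed or not are all
hypothetical; this only illustrates the rule "report what upstream
states, plus anything not allowed".

```python
def reported_licenses(found, upstream_stated, allowed):
    """Pick which reviewed licenses go into the license tag.

    Keeps a license if upstream states it, or if it is not classified
    as allowed (those must always be exposed). Order follows `found`.
    """
    return [
        lic for lic in found
        if lic in upstream_stated or lic not in allowed
    ]


# Hypothetical review result: five applicable licenses were identified.
found = [
    "LGPL-2.1-or-later",
    "BSD-3-Clause",
    "MIT",
    "ISC",
    "LicenseRef-sun-rpc",
]
upstream_stated = {"LGPL-2.1-or-later"}
# Hypothetical classification for this example only.
allowed = {"LGPL-2.1-or-later", "BSD-3-Clause", "MIT", "ISC"}

license_tag = " AND ".join(reported_licenses(found, upstream_stated, allowed))
print(license_tag)  # LGPL-2.1-or-later AND LicenseRef-sun-rpc
```

The allowed licenses that upstream does not mention are dropped from
the tag, but the review that found them still has to happen first.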

> > Also, I don't think "snippets" are the typical case. Often the non-GPL
> > license will appear to cover a whole file or perhaps a set of multiple
> > files. I have found it somewhat common for a Fedora package to include
> > multiple "merely aggregated" works which may be under the GPL and
> > other licenses. That's mere aggregation based on the license steward's
> > traditional guidance on interpretation of the GPL. In those scenarios
> > attempts to apply an effective license theory that ignores the non-GPL
> > license seem to embody a misunderstanding of the orthodox
> > interpretation of the GPL.
>
> That is not my experience from working on some larger code bases.  For
> example when we were integrating GNU Classpath/IcedTea/OpenJDK I went
> over all the code to make sure we could merge the code bases. The top
> level LICENSE file explains the (effective) licenses. And every source
> code file has a header explaining the (effective) license for that
> file (GPLv2 or GPLv2-plus-classpath-exception). But it also includes
> lots of notices like:
>
> /*
>  * This file is available under and governed by the GNU General Public
>  * License version 2 only, as published by the Free Software Foundation.
>  * However, the following notice accompanied the original version of this
>  * file and, per its terms, should not be removed:
>  *
>  * Copyright (c) 2004 World Wide Web Consortium,
>  *
>  * (Massachusetts Institute of Technology, European Research Consortium for
>  * Informatics and Mathematics, Keio University). All Rights Reserved. This
>  * work is distributed under the W3C(r) Software License [1] in the hope that
>  * it will be useful, but WITHOUT ANY WARRANTY; without even the implied
>  * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>  *
>  * [1] http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
>  */

So in that example, I understand you believe that there is an
effective license, but it can't be said that the W3C license is not
also an applicable license. If it really did not apply, why would its
notice need to be preserved?

You can argue that the W3C license isn't worth including in the
license tag, but that requires some formulation of a policy for what
kinds of licenses can and can't be excluded.

> Something similar is done in glibc. For example several files I
> contributed to were adapted from some BSD release and have a file
> header saying the file is copyright the Free Software Foundation,
> Inc. This file is part of the GNU C Library. And they state they are
> distributed under the GNU Lesser General Public License 2.1 or
> later. But also have the original BSD notice in the file:
>
> /*-
>  * Copyright (c) 1990, 1993, 1994
>  *      The Regents of the University of California.  All rights reserved.
>  *
>  * Redistribution and use in source and binary forms, with or without
>  * modification, are permitted provided that the following conditions
>  * are met:
>  * 1. Redistributions of source code must retain the above copyright
>  *    notice, this list of conditions and the following disclaimer.
>  * 2. Redistributions in binary form must reproduce the above copyright
>  *    notice, this list of conditions and the following disclaimer in the
>  *    documentation and/or other materials provided with the distribution.
>  * 4. Neither the name of the University nor the names of its contributors
>  *    may be used to endorse or promote products derived from this software
>  *    without specific prior written permission.
>  *
>  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
>  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
>  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
>  * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
>  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
>  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
>  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
>  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
>  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
>  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
>  * SUCH DAMAGE.
>  */
>
> But this is not the (effective) license, and there is no way to use
> the code under that license, since all contributions since 1994 have
> been done under the LGPL.

Again, someone is making an assumption that something is there that is
still subject to that license, because otherwise it could be removed.
In review of Fedora packages over the past year, we have found a
number of cases where it seems clear a license notice no longer
applies to anything in the package, or never applied in the first
place. In at least one of those cases we recommended to the upstream
project that it remove the "phantom" license notice.

> Likewise for valgrind we have examples of the above. For example the
> dhat tool, which has a GPLv2+ copyright and license header, but also
> says:
>
> /*
>    Parts of this file are derived from Firefox, copyright Mozilla Foundation,
>    and may be redistributed under the terms of the Mozilla Public License
>    Version 2.0, as well as under the license of this project.  A copy of the
>    Mozilla Public License Version 2.0 is available at at
>    https://www.mozilla.org/en-US/MPL/2.0/.
> */
>
> Again, although there is a reference to MPLv2 here, the code is only
> available under GPLv2+.

But that notice literally says there is code available under MPL 2.0.

If the notice is incorrect, that is a bug that should be fixed
upstream. But a mere conflict with a project's conception of what its
effective license is would not mean that the license notice is
incorrect.


> > When you say "upstream just uses one effective license", unfortunately
> > it is rarely so clear even if you set aside the basic lack of clarity
> > on what an effective license even is.
>
> In practice upstream often simply has a top-level COPYRIGHT, LICENSE
> or README file where they state the (effective) license, which covers
> the project as a whole.

Unfortunately, a common situation is that the license stated in the
LICENSE or README file cannot or does not account for the presence of
subparts stated as being under other licenses. We certainly can't
pretend those other licenses don't exist for purposes of review for
conformance to Fedora licensing policy. Again, we could adopt a
different approach for license tags but I don't think it's workable to
say "Exclude all license identifiers covering things you've found,
other than the one that seems closest to what's communicated in the
LICENSE or README file". (Although I'd note this actually is very
close to the current unsatisfactory Fedora approach to dealing with
what license texts to include in /usr/share/licenses.)

> > The goal of the policy is not to promote reusability of isolated
> > source code elements under particular associated licenses. Rather, as
> > I see it, license metadata is fundamentally unnecessary, but if we are
> > going to have license metadata at all, and for better or worse I think
> > there is an expectation that we must, it might as well strive for
> > accuracy and consistency with metadata across all packages.
>
> I don't think trying to include SPDX expressions for all
> license/permission snippets some source scanning tools can find is
> really accurate or consistent. It will both miss some sources that do
> end up in the binaries and report some that won't or that aren't
> actually correctly describing the license permissions.
>
> Maybe doing what Debian does would better match your goal. Include a
> good faith attempt to include all copyright and license strings found
> in the upstream sources. Since we always distribute the sources
> someone who cares can then get those and look for the exact context
> and meaning of those strings.

This was proposed by someone else in a recent thread. I like this idea
(and think of how it would allow substantial reuse of work done by,
and future collaboration with, Debian) but it would be the most
radical change to how Fedora deals with licenses and license metadata
in its entire history, so I don't want to be the one to propose it. :)
Maybe this was a mistake, but in revamping the documentation and
guidelines around licensing last year we placed a lot of emphasis on
continuity with past/existing practices.

[1] Following our bringing this to IBM's attention, IBM has agreed
informally to relicense this code under the MIT license. However I
have a long overdue response to a glibc contributor regarding how to
accomplish this.

Richard