On Sun, Sep 17, 2023 at 11:37 AM Mark Wielaard <mark@xxxxxxxxx> wrote:
>
> To be clear I don't mind using a different set of short-hands in the
> License tags. Although it feels a little odd to try to create separate
> identifiers for lax-permissive MIT/BSD like licenses which sometimes
> just different in one or two words.

FWIW, usually a difference of one or two words wouldn't be enough to result in creation of a distinct SPDX identifier. The standard applied by SPDX is, informally, whether the difference is "legally substantive" (this has its flaws but seems to work OK in practice).

I think anyone should be free to propose a new umbrella identifier (in SPDX expression format) that would cover multiple licenses, as we've done with `LicenseRef-Fedora-Public-Domain` and `LicenseRef-Fedora-UltraPermissive`. The important thing is that it be well defined in some way.

> However I really don't understand the purpose or goal of this idea of
> creating a large expression of all the license/permission notices that
> might be found in the sources that possibly make up the binary to the
> License tag. Because in my experience those are often not relevant or
> accurate at all.

The more important objective is to have package reviewers periodically review packages to ensure that all licenses are 'allowed' by Fedora, or covered by some documented exception. This activity should then make it possible to create a license tag reflecting the licenses found (to the extent they are relevant to the Fedora context beyond merely existing in the source code). "Relevant" includes any situation where a license covers something in an installable package. I think we may disagree on the "covers" part.

This is not really a new requirement. The idea that the license tag should reflect the various licenses that apply to a binary is something that has been in place for ~15 years, though it may have been documented in an inconsistent or contradictory way.
I think the use of umbrella identifiers in the Callaway system somewhat obscured the fact that this was the expected approach.

> > > What is the goal of dropping the effective license and make packagers
> > > list all the licences of some code snippets originally incorporated
> > > under lax-permissive licenses? Is that not just make work for the
> > > packager if upsteam just uses one effective license?
> >
> > One rationale is given in Fedora legal documentation:
> > "There is no agreed-upon set of criteria or rules under which one can
> > make conclusions about “effective” licenses or reduce composite
> > license expressions to something simpler."
>
> Isn't that not just like most other things fedora, we follow
> upstream. Upstream states the (effective) license and we just adopt
> that. If we notice that there might be a bug and the effective license
> isn't exactly as the upstream project states, then we fix that
> upstream?

I basically don't recognize "effective license" as a valid concept. I see people using it, perhaps increasingly, but I never see any definition of what it means. It sounds like you are using it to mean "whatever the upstream project seems to say the license is, despite possible evidence to the contrary". I'm not sure that's how other people are using "effective license".

I think Jilayne would disagree with this, but in practice, I also don't see what we could fix upstream, since there is no standard for how you communicate or document what the effective license is (regardless of what it means). The only related standard I know of for documenting licensing of projects is REUSE (https://reuse.software) which I think implicitly also rejects the concept of "effective licensing".

> > Basically, everyone has been making up their own interpretive system
> > for deciding what an "effective license" is, with no consistencies
> > across upstream packages and Fedora package maintainers.
>
> Is this really a problem?
> Could you show an example where an upstream
> or package maintainer stated in the license tag that the effective
> license was say "GPLv3+", but it would have been more "correct" to state
> that it was "GPL-3.0-or-later AND GPL-3.0-or-later WITH
> Autoconf-exception-generic-3.0 AND GPL-3.0-or-later WITH
> Bison-exception-2.2 AND GPL-2.0-or-later AND GPL-2.0-or-later WITH
> Autoconf-exception-generic AND LGPL-2.1-or-later AND LGPL-2.0-or-later
> AND X11"?

I will not argue this, but I will make two observations. One is something I've said before, which is that people seem to be complaining about the current standards for license tags only when they are lengthy. I think it would be more consistent to argue that we don't need license tags at all. I have no attachment to RPM-style license tags, though Red Hat finds them marginally useful for some purposes.

The other thing is that the discipline that produces license tags at this level of detail is what is needed to uncover licensing problems in packages, from Fedora's perspective as a distribution that aims to be made up of free software. That is, the detailed license tags are a side effect of a valuable license review process and I would be concerned that falling back on an effective license approach would result in the loss of the benefits we get from that process, which actually long precede the abandonment of the Callaway system.

You've contributed to glibc, so you probably know that for many years (almost 20 years?) glibc gave the impression that its license, its effective license if you will, was LGPLv2.1 or later (except for parts that are under the GPL) but it included a substantial amount of code under a license that, by the mid-2000s, Debian and later Fedora came to regard as non-free. I am speaking of the famous Sun RPC license, which prohibits distribution in isolation, a common type of proprietary license restriction.
In that scenario, if you had a license tag that just says "LGPL-2.1-or-later" you are concealing the fact that there is also some code under a license that cannot be assimilated with LGPL (other than by adopting a clever post hoc interpretation which cannot possibly be what Sun Microsystems had in mind in the 1980s) and that is not even free software. It seems to me that at the very least the license tag, if you're going to have license tags at all, should say "LGPL-2.1-or-later AND LicenseRef-Assorted-Other-Free-Software AND LicenseRef-sun-rpc". But if there's a practice of just relying on whatever the effective license seems to be, you would be inclined not to notice a license like this in the first place. This is why the issue was first surfaced by Debian, I think. To your later point about Debian copyright files, it is obviously true that you don't need to have a license tag system like Fedora has for this to happen.

While the Sun RPC problem *may* have been excised from glibc, just last year we found another license in glibc (and at least one other package), this time an IBM license [1], that we consider non-free by present day standards, in that case because it involves a patent license grant that discriminates according to specific use cases.

I think we should aspire to finding, *exposing*, and fixing these kinds of problems. Exposing should mean at a minimum that we don't perpetuate a community-wide decades-old practice of covering these problems up, which seems to be one practical effect of indulging in effective licensing. I realize all this doesn't itself justify the resulting use of complex composite SPDX expressions.

> It seems that the "enumeration" expression is not that easy to create
> objectively.
> If only because it is actually hard to know which sources
> to scan to get the license/permission snippets (just the upstream tar
> ball, the sources as created by fedpkg prep, those actually included
> in the binaries which depend on the build environment, etc.)

Similar points have been raised by others. I think a good solution is to reformulate what we mean by "enumeration" so that it is more practical for Fedora package maintainers. I don't think it is a good solution to just give up and no longer review package source code for inclusion of licenses that conflict with Fedora licensing policies.

> And what is the actual purpose and goal of including them in the spec License
> tag?

For me, it's that there's no good argument for throwing away the information once you have it. You've reviewed a given package and let's say you've identified five applicable licenses (let's assume we know what applicable means). How do you then decide what information to hide? I think you are saying, "review the package thoroughly, but don't report what you find in the license tag, just pick the license(s) the upstream project indicates are effective". As suggested above, I'm not opposed to something similar to this approach (I think Jilayne would disagree though) provided that we always expose licenses that are not classified as 'allowed' for Fedora.

> > Also, I don't think "snippets" are the typical case. Often the non-GPL
> > license will appear to cover a whole file or perhaps a set of multiple
> > files. I have found it somewhat common for a Fedora package to include
> > multiple "merely aggregated" works which may be under the GPL and
> > other licenses. That's mere aggregation based on the license steward's
> > traditional guidance on interpretation of the GPL. In those scenarios
> > attempts to apply an effective license theory that ignores the non-GPL
> > license seem to embody a misunderstanding of the orthodox
> > interpretation of the GPL.
>
> That is not my experience from working on some larger code bases. For
> example when we were integrating GNU Classpath/IcedTea/OpenJDK I went
> over all the code to make sure we could merge the code bases. The top
> level LICENSE file explains the (effective) licenses. And every source
> code file has a header explaining the (effective) license for that
> file (GPLv2 or GPLv2-plus-classpath-exception). But it also includes
> lots of notices like:
>
> /*
>  * This file is available under and governed by the GNU General Public
>  * License version 2 only, as published by the Free Software Foundation.
>  * However, the following notice accompanied the original version of this
>  * file and, per its terms, should not be removed:
>  *
>  * Copyright (c) 2004 World Wide Web Consortium,
>  *
>  * (Massachusetts Institute of Technology, European Research Consortium for
>  * Informatics and Mathematics, Keio University). All Rights Reserved. This
>  * work is distributed under the W3C(r) Software License [1] in the hope that
>  * it will be useful, but WITHOUT ANY WARRANTY; without even the implied
>  * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>  *
>  * [1] http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
>  */

So in that example, I understand you believe that there is an effective license, but it can't be said that the W3C license is not also an applicable license, or, if it is, then why should it not be removed? You can argue that the W3C license isn't worth including in the license tag, but that requires some formulation of a policy for what kinds of licenses can and can't be excluded.

> Something similar is done in glibc. For example several files I
> contributed to were adapted from some BSD release and have a file
> header saying the file is copyright the Free Software Foundation,
> Inc. This file is part of the GNU C Library. And the state they are
> distributed under the GNU Lesser General Public License 2.1 or
> later.
> But also have the original BSD notice in the file:
>
> /*-
>  * Copyright (c) 1990, 1993, 1994
>  *      The Regents of the University of California. All rights reserved.
>  *
>  * Redistribution and use in source and binary forms, with or without
>  * modification, are permitted provided that the following conditions
>  * are met:
>  * 1. Redistributions of source code must retain the above copyright
>  *    notice, this list of conditions and the following disclaimer.
>  * 2. Redistributions in binary form must reproduce the above copyright
>  *    notice, this list of conditions and the following disclaimer in the
>  *    documentation and/or other materials provided with the distribution.
>  * 4. Neither the name of the University nor the names of its contributors
>  *    may be used to endorse or promote products derived from this software
>  *    without specific prior written permission.
>  *
>  * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
>  * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
>  * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
>  * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
>  * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
>  * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
>  * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
>  * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
>  * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
>  * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
>  * SUCH DAMAGE.
>  */
>
> But this is not the (effective) licenses, and there is no way to use
> the code under that license, since all contributions since 1994 have
> been done under the LGPL.

Again, someone is making an assumption that something is there that is still subject to that license, because otherwise it could be removed.
In review of Fedora packages over the past year, we have found a number of cases where it seems clear a license notice no longer applies to anything in the package, or never applied in the first place. In at least one of those cases we recommended to the upstream project that it remove the "phantom" license notice.

> Likewise for valgrind we have examples of the above. For example the
> dhat tool which have a GPLv2+ copyright and license header, but also
> say:
>
> /*
>    Parts of this file are derived from Firefox, copyright Mozilla Foundation,
>    and may be redistributed under the terms of the Mozilla Public License
>    Version 2.0, as well as under the license of this project. A copy of the
>    Mozilla Public License Version 2.0 is available at at
>    https://www.mozilla.org/en-US/MPL/2.0/.
> */
>
> Again, although there is a reference to MPLv2 here, the code is only
> available under GPLv2+.

But that notice literally says there is code available under MPL 2.0. If the notice is incorrect, that is a bug that should be fixed upstream. But a mere conflict with a project's conception of what its effective license is would not mean that the license notice is incorrect.

> > When you say "upstream just uses one effective license", unfortunately
> > it is rarely so clear even if you set aside the basic lack of clarity
> > on what an effective license even is.
>
> In practice upstream often simply has a top-level COPYRIGHT, LICENSE
> or README file where they state the (effective) license, which covers
> the project as a whole.

Unfortunately, a common situation is that the license stated in the LICENSE or README file cannot or does not account for the presence of subparts stated as being under other licenses. We certainly can't pretend those other licenses don't exist for purposes of review for conformance to Fedora licensing policy.
Again, we could adopt a different approach for license tags but I don't think it's workable to say "Exclude all license identifiers covering things you've found, other than the one that seems closest to what's communicated in the LICENSE or README file". (Although I'd note this actually is very close to the current unsatisfactory Fedora approach to dealing with what license texts to include in /usr/share/licenses.)

> > The goal of the policy is not to promote reusability of isolated
> > source code elements under particular associated licenses. Rather, as
> > I see it, license metadata is fundamentally unnecessary, but if we are
> > going to have license metadata at all, and for better or worse I think
> > there is an expectation that we must, it might as well strive for
> > accuracy and consistency with metadata across all packages.
>
> I don't think trying to include SPDX expression for all
> license/permission snippets some source scanning tools can find is
> really accuracte or consistent. It will both miss some sources that do
> end up in the binaries and report some that won't or that aren't
> actually correctly describing the license permissions.
>
> Maybe doing what Debian does would better match your goal. Include a
> good faith attempt to include all copyright and license strings found
> in the upstream sources. Since we always distribute the sources
> someone who cares can then get those and look for the exact context
> and meaning of those strings.

This was proposed by someone else in a recent thread.
I like this idea (and think of how it would allow substantial reuse of work done by, and future collaboration with, Debian) but it would be the most radical change to how Fedora deals with licenses and license metadata in its entire history, so I don't want to be the one to propose it. :) Maybe this was a mistake, but in revamping the documentation and guidelines around licensing last year we placed a lot of emphasis on continuity with past/existing practices.

[1] Following our bringing this to IBM's attention, IBM has agreed informally to relicense this code under the MIT license. However I have a long overdue response to a glibc contributor regarding how to accomplish this.

Richard