On Wed, 22 May 2019, J Lovejoy wrote:
>
> On May 22, 2019, at 3:10 PM, John Sullivan <johns@xxxxxxx> wrote:
> >> When a license defines a recommended notice to attach to files under
> >> that license (sometimes called a "standard header"), the SPDX project
> >> recommends that the standard header be included in the files, in
> >> addition to an SPDX ID.
> >
> >> Additionally, when a file already contains a standard header or other
> >> license notice, the SPDX project recommends that those existing notices
> >> should not be removed. The SPDX ID is recommended to be used to
> >> supplement, not replace, existing notices in files.
> >
> >> Like copyright notices, existing license texts and notices should be
> >> retained, not replaced ‐ especially a third party's license notices.
> >
>
> that text is from the SPDX website and is very generalized, conservative
> and non-contextual. The reality we live in today is that people are
> choosing to use the SPDX identifiers in their files instead of the full
> license text (for MIT) or the standard license notice (for Apache-2.0 or
> GPL), etc. - this is good because SPDX identifiers are more concise and
> easier for tooling to parse. Even when there is a standard license
> header recommended, like the GPL has done, it doesn’t get faithfully
> reproduced which causes headaches for tooling to parse even when the
> intent is clear. This is what Thomas is dealing with and you can see the
> many examples of this on the many other emails on this list.

Just to add some more context why we are doing this:

The first and most important reason is that - as demonstrated with this
work already - the tools are lost on identifying the correct meaning of all
the 700+ variations of expressing just GPL licensing terms. We're not
talking about the other 80+ license variants (some of them are of dubious
nature) yet.

Right now as things stand it is simply _impossible_ for SMBs to do proper
license compliance on the kernel.
That's a situation which we cannot perpetuate forever, and waiting for
everyone and his dog to clean that up on their own will take, at the
current rate and interest, 10 years plus. In fact it will never finish
because people are no longer reachable, companies are gone ...

As long as that persists any company who cannot afford to pay the price for
wading through that mess manually is going to be an easy target for
licensing trolls. But even companies who can afford it gain a nice excuse
why they did not comply, as it's possible for them to demonstrate that they
did all they could, but the unholy mess is responsible for their failure.
That has been used as an argument successfully already :(

If we keep all the silly variants of license references/notices around and
just add SPDX identifiers then we are back to square one with this.

For compliance you have to scan EVERYTHING which looks like license
information. So then you end up with the same heuristic guesswork to figure
out whether the SPDX identifiers are matching the random mess we left in
place. IOW, we just kept the status quo and the SPDX identifier degenerated
to a hint.

I appreciate that lawyers are trying to minimize the risk, but can we
pretty please be pragmatic and keep the priority on making compliance
possible in the first place? That serves everyone, the contributors and the
downstream users.

FWIW, the same procedure (smaller scale) was conducted on the u-boot
project a few years ago already and to the best of my knowledge nobody has
come forth and made a fuss about that approach.

Thanks,

	Thomas