Re: the license of diskprediction's pre-trained models and more

Lars Marowsky-Bree <lmb@xxxxxxxx> · Tue, 2 Oct 2018 12:14:07 +0200

On 2018-10-01T12:35:57, John Spray <jspray@xxxxxxxxxx> wrote:

> In my non-legally-trained opinion, when distributing public domain
> files the key distinction is whether these files are considered
> software (i.e. having source code) or just data.

Licenses for data are hard, since software licenses don't apply well.
The closest match to the GPL-style protections, rights, and freedoms
would likely be https://cdla.io/sharing-1-0/

> Still, it would certainly be a good thing to have the original
> training data available, to avoid any possible ambiguity arising from
> differing interpretations, and to make it obvious how others can
> recreate models from alternative source data.

Yes. If the original data are not available (including the goal
functions, hyperparameters, etc.), it becomes impossible to understand
what the model was trained to do.

That's acceptable for a proprietary offering, but not something that
fits under the idea of a Free/Libre/Open Source project.

Regards,
    Lars

-- 
Architect SDS, Distinguished Engineer
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli Zbinden)