On 2018-10-01T12:35:57, John Spray <jspray@xxxxxxxxxx> wrote:

> In my non-legally-trained opinion, when distributing public domain
> files the key distinction is whether these files are considered
> software (i.e. having source code) or just data.

Licenses for data are hard, since software licenses don't apply well.
The closest match to the GPL-style protections, rights, and freedoms
would likely be https://cdla.io/sharing-1-0/

> Still, it would certainly be a good thing to have the original
> training data available, to avoid any possible ambiguity arising from
> differing interpretations, and to make it obvious how others can
> recreate models from alternative source data.

Yes. If the original data are not available (including the goal
functions, hyperparameters, etc.), it becomes impossible to understand
what the model was trained to do. That's acceptable for a proprietary
offering, but not something that fits under the idea of a
Free/Libre/Open Source project.

Regards,
    Lars

-- 
Architect SDS, Distinguished Engineer
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli Zbinden)