Re: the license of diskprediction's pre-trained models and more

Shengjing Zhu <zhsj@xxxxxxxxxx> · Mon, 1 Oct 2018 20:26:26 +0800

On Mon, Oct 01, 2018 at 06:59:45PM +0800, kefu chai wrote:
> I am wondering if we could go further by providing users with the
> pre-labeled SMART dataset for all combinations of SMART attributes
> listed in config.json, plus the script and documentation for
> training on them, if only commodity hardware and free software are
> required to process the dataset.

IMHO, this usually needs proprietary software like CUDA. Maybe we're not
in the free/libre era for machine learning yet...

On Mon, Oct 1, 2018 at 7:37 PM John Spray <jspray@xxxxxxxxxx> wrote:
> If we consider these files to be software, then it's correct to say
> that a public domain binary is non-free.  If we consider them data,
> then a public domain binary is just a piece of data (analogous to
> distributing a .jpeg file but not the photographer's original .raw
> file).  I would lean toward the second view -- in my view, machine
> learning datasets are not source code, as they're numeric data rather
> than computer instructions.
>

My point is whether the user can modify it. For a picture, even without
the .raw, we can still modify the .jpeg with GIMP. For a font (usually
another case), we can modify the .ttf/.otf with FontForge. Both GIMP and
FontForge are free software.

Is it possible to use free software to modify the .joblib files? With my
limited knowledge of machine learning, it looks like scikit-learn can only
load and use them?
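
To make the question concrete, here is a rough sketch of what "modifying"
a .joblib file with free tools (joblib and scikit-learn, both BSD-licensed)
might look like. Everything in it is an assumption for illustration: the
data is random toy data, not the diskprediction SMART dataset, the file
name toy_model.joblib is made up, and SGDClassifier is only a stand-in
because it supports partial_fit. I do not know which estimator types the
shipped models actually use.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import SGDClassifier

# Toy stand-in data (assumption): NOT the diskprediction SMART dataset.
rng = np.random.RandomState(0)
X = rng.rand(200, 4)
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)

# Train a small model and save it, mimicking a shipped pre-trained file.
model = SGDClassifier(random_state=0)
model.partial_fit(X[:100], y[:100], classes=np.array([0, 1]))
path = os.path.join(tempfile.mkdtemp(), "toy_model.joblib")  # hypothetical name
joblib.dump(model, path)

# Loading gives back an ordinary Python object, not an opaque blob.
loaded = joblib.load(path)
coef_before = loaded.coef_.copy()

# Modification 1: edit a learned parameter directly.
loaded.coef_[0, 3] = 0.0

# Modification 2: continue training on more data (needs partial_fit support).
loaded.partial_fit(X[100:], y[100:])

# Write the modified model back and confirm the change survived the round trip.
joblib.dump(loaded, path)
reloaded = joblib.load(path)
coef_changed = not np.array_equal(coef_before, reloaded.coef_)
```

So the file itself can be loaded, changed and re-saved. The caveat is
that many estimators have no partial_fit, and without the training data
a meaningful retrain is not possible, which is closer to the .raw
situation than to the .jpeg one.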

-- 
Shengjing Zhu