On Mon, Oct 01, 2018 at 06:59:45PM +0800, kefu chai wrote:
> i am wondering if we could move further by providing user the
> pre-labeled SMART dataset of all listed combination of SMART
> attributes combination in config.json, script and document for
> training them, if only commodity hardware and free software are
> required to process the dataset.

IMHO, this usually needs proprietary software like CUDA; maybe we're not
in the free/libre era for machine learning yet...

On Mon, Oct 1, 2018 at 7:37 PM John Spray <jspray@xxxxxxxxxx> wrote:
> If we consider these files to be software, then it's correct to say
> that a public domain binary is non-free. If we consider them data,
> then a public domain binary is just a piece of data (analogous to
> distributing a .jpeg file but not the photographer's original .raw
> file). I would lean toward the second view -- in my view, machine
> learning datasets are not source code, as they're numeric data rather
> than computer instructions.

My point is whether the user can modify it. For a picture, even without
the .raw, we can still modify the .jpeg with GIMP. For fonts (usually
another such case), we can modify .ttf/.otf files with fontforge. Both
GIMP and fontforge are free software.

Is it possible to use free software to modify the .joblib files? With my
limited knowledge of machine learning, it looks to me like scikit-learn
can only load and use them?

-- 
Shengjing Zhu