Re: the license of diskprediction's pre-trained models and more


On Mon, Oct 1, 2018 at 8:26 PM Shengjing Zhu <zhsj@xxxxxxxxxx> wrote:
>
> On Mon, Oct 01, 2018 at 06:59:45PM +0800, kefu chai wrote:
> > i am wondering if we could move further by providing users with the
> > pre-labeled SMART dataset for all the combinations of SMART
> > attributes listed in config.json, plus the script and documentation for
> > training on them, provided that only commodity hardware and free
> > software are required to process the dataset.
>
> IMHO, this usually needs proprietary software like CUDA, maybe we're not
> in the free/libre era for machine learning yet...

i believe the SVM classifiers that come with pybind/mgr/diskprediction
were trained using sklearn.svm.SVC[0], which in turn is implemented using
libsvm[1,2]. libsvm could be rewritten to take advantage of GPUs using
technologies like CUDA, but i don't think that's a must. diskprediction
uses a single classifier for labelling the positive samples. and since
the models were trained using SVC, training time grows at least
quadratically, roughly O(n^2) or worse, where n is the size of the
dataset. so for a dataset of modest size, it should be acceptable to
perform the training without a GPU.

---
[0] http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
[1] https://github.com/scikit-learn/scikit-learn/tree/master/sklearn/svm/src/libsvm
[2] https://www.csie.ntu.edu.tw/~cjlin/libsvm/
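
to make the "no GPU needed" point concrete, here is a minimal sketch of training an sklearn.svm.SVC on the CPU. the data is synthetic and the feature count, sample count and kernel are my assumptions for illustration. this is not the actual diskprediction training pipeline, which i have not seen.

```python
import time

import numpy as np
from sklearn.svm import SVC


def train_svc_on_cpu(n_samples=2000, n_features=6, seed=0):
    """Fit an SVC on synthetic data and report accuracy and wall time.

    n_samples and n_features are hypothetical; they merely stand in for a
    table of SMART attributes with a binary healthy/failing label.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # Synthetic binary label derived from the first two features.
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

    clf = SVC(kernel="rbf", gamma="scale")  # libsvm-backed, CPU only
    start = time.perf_counter()
    clf.fit(X, y)
    elapsed = time.perf_counter() - start

    return clf, clf.score(X, y), elapsed


clf, accuracy, elapsed = train_svc_on_cpu()
print("training accuracy: %.3f, fit time: %.2fs" % (accuracy, elapsed))
```

since fit time grows at least quadratically with n_samples, doubling the dataset should roughly quadruple the fit time or worse. that is easy to check by calling train_svc_on_cpu() with a few different sizes.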


>
> On Mon, Oct 1, 2018 at 7:37 PM John Spray <jspray@xxxxxxxxxx> wrote:
> > If we consider these files to be software, then it's correct to say
> > that a public domain binary is non-free.  If we consider them data,
> > then a public domain binary is just a piece of data (analogous to
> > distributing a .jpeg file but not the photographer's original .raw
> > file).  I would lean toward the second view -- in my view, machine
> > learning datasets are not source code, as they're numeric data rather
> > than computer instructions.
> >
>
> My point is whether the user can modify it. For a picture, without the
> .raw, we can still modify the .jpeg with GIMP. For a font (usually
> another case), we can modify the .ttf/.otf with fontforge. Both GIMP
> and fontforge are free software.
>
> Is it possible to use free software to modify the .joblib files? With my
> limited knowledge of machine learning, it seems scikit-learn can only
> load and use them?

in addition to consuming them, scikit-learn can be used to create
models. and as i explained in another mail in this thread, users, even
professional ones, are not supposed to tweak the model manually.
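
as a rough illustration of "create, not just load", the sketch below builds a model with scikit-learn, stores it with joblib, loads it back and then retrains it on new labels. the file name example_model.joblib and the synthetic data are hypothetical; the real diskprediction .joblib files and their features are not reproduced here.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for a labelled dataset (assumption, not real SMART data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)

# Create a model with free software only, then store it as a .joblib file.
model = SVC(kernel="linear").fit(X, y)
path = os.path.join(tempfile.mkdtemp(), "example_model.joblib")
joblib.dump(model, path)

# Load it back: this is the "consume" path.
loaded = joblib.load(path)
same_predictions = bool((loaded.predict(X) == model.predict(X)).all())

# "Modify" it the supported way: retrain the loaded estimator on new labels
# instead of editing the stored numbers by hand.
y_new = (X[:, 1] > 0).astype(int)
loaded.fit(X, y_new)
retrained_accuracy = loaded.score(X, y_new)
print(same_predictions, retrained_accuracy)
```

so the practical way to change a model is to retrain it from data, which is why having the dataset and the training script matters more than being able to edit the file itself.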

>
> --
> Shengjing Zhu



-- 
Regards
Kefu Chai


