On 26.12.21 at 20:51, Matthew Miller wrote:
Marius, are the different language packs updated continually and separately, or is there one versioned set of all of them released at intervals? Is it a case where everything is regenerated, or are additions incremental? (And do they _replace_ or just add?)
The language files are separate for each language. They are not updated together.
It's more the massive amount of storage space in total that worries me.
The first release would be less than 40G; that figure was just a size the entire project will easily reach if it grows
like it did in the past.
It does seem like it'd be nice to have a way to deliver (officially from Fedora in a way that can be shipped in Spins and containers) static files that don't change, without needing to redownload gigabytes on upgrade. Of course, delta RPMs are one way, but need a lot of investment in actually working again. Ostree deltas are another — and maybe upcoming work on container deltas could be helpful.
I don't see a way to reduce the update size, as it is mostly one big file:
[marius@eve ~]$ ll /usr/share/pva/vosk-model-de-0.21/
insgesamt 28
drwxr-xr-x. 2 marius marius 4096 21. Aug 2020 am
drwxr-xr-x. 2 marius marius 4096 2. Aug 2020 conf
drwxr-xr-x. 3 marius marius 4096 9. Aug 2020 graph
drwxr-xr-x. 2 marius marius 4096 21. Aug 2020 ivector
-rw-r--r--. 1 marius marius 740 15. Sep 00:21 README
drwxr-xr-x. 2 marius marius 4096 9. Aug 2020 rescore
drwxr-xr-x. 2 marius marius 4096 15. Sep 00:14 rnnlm
[marius@eve ~]$ du -sh /usr/share/pva/vosk-model-de-0.21/*
100M /usr/share/pva/vosk-model-de-0.21/am
12K /usr/share/pva/vosk-model-de-0.21/conf
685M /usr/share/pva/vosk-model-de-0.21/graph
8,2M /usr/share/pva/vosk-model-de-0.21/ivector
4,0K /usr/share/pva/vosk-model-de-0.21/README
2,1G /usr/share/pva/vosk-model-de-0.21/rescore
281M /usr/share/pva/vosk-model-de-0.21/rnnlm
[marius@eve ~]$ ll /usr/share/pva/vosk-model-de-0.21/rescore/
insgesamt 2171812
-rw-r--r--. 1 marius marius 2115929988 14. Sep 20:58 G.carpa
-rw-r--r--. 1 marius marius 107992138 14. Sep 20:50 G.fst
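To put a number on "mostly one big file", here is a minimal Python sketch. It is only an illustration, not part of pva or vosk: the helper names `file_sizes` and `share_of_largest` are made up for this example, and the two byte counts are copied from the rescore/ listing above.

```python
import os


def file_sizes(root):
    """Return (size_in_bytes, path) for every regular file below root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a linked file is not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                sizes.append((os.path.getsize(path), path))
    return sorted(sizes, reverse=True)


def share_of_largest(sizes):
    """Return the fraction of the total size taken by the single largest entry."""
    total = sum(size for size, _ in sizes)
    if total == 0:
        return 0.0
    return max(size for size, _ in sizes) / total


# Byte counts taken from the rescore/ listing above (hypothetical helper names).
rescore = [(2115929988, "G.carpa"), (107992138, "G.fst")]
print(f"G.carpa share of rescore/: {share_of_largest(rescore):.1%}")
# prints "G.carpa share of rescore/: 95.1%"
```

Running `share_of_largest(file_sizes("/usr/share/pva/vosk-model-de-0.21"))` on an installed model would give the same kind of figure for the whole model directory.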
(And... I think it'd be useful in a lot of cases to be able to do dist-git -> container without needing to build RPMs as an intermediate step. But... that's not a thing we have now.)
As far as I understand the packaging rules, autodownloaders are not welcome,
and for security reasons, I absolutely support this.
We could shrink the problem at the beginning: there are no voice commands ready for other languages yet, so it does not make sense to
have their language models around. I really hope the project gets a kick start once the first people use it. It's quite easy to write a set of commands
and get it running. I suggest a nice feature in the Fedora Magazine about a working assistant for Fedora.
So at the beginning, we are talking about 2-4 GB for German and English. The pva itself isn't that storage hungry, a MB at best. A few vosk deps here and there:
~100 MB uncompressed, maybe.
For now, I'm rebuilding the compile process against our Fedora libs, so we can ship the required packages for kaldi & vosk. The required libs shipped with Fedora are older than the current ones used by the vosk devs, which is a problem.
With pip as the source for vosk, it works as expected, but the local vosk & kaldi builds do not work yet :(
best regards,
Marius