# Table of Contents

-   [take notes](#tknts) 
-   [make bookmarks on youtube videos/vlc via voice](#mkbkmrksnytbvdsvlcvvc) 
    -   [https://chrome.google.com/webstore/detail/lipsurf/lnnmjmalakahagblkkcnjkoaihlfglon lipsurf! nice thing!](#schrmgglcmwbstrdtllpsrflnblkkcnjkhlfglnlpsrfncthng) 
-   [Voice pause vlc or music](#vcpsvlcrmsc) 
-   [`[2018-08-27]` Michael Sheldon's Stuff » Speech Recognition – Mozilla’s DeepSpeech, GStreamer and IBus](#mchlshldnsstffspchrcgntnmzllsdpspchgstrmrndbs) 
-   [using voice to code](#sngvctcd) [[programming]]
-   [`[2018-08-27]` Is there any decent speech recognition software for Linux? - Unix & Linux Stack Exchange](#sthrnydcntspchrcgntnsftwrfrlnxnxlnxstckxchng) 
    -   [add to kibitzr? or some similar tool for tracking, apparently there are no subscriptions on stackexchange :(](#ddtkbtzrrsmsmlrtlfrtrckngythrrnsbscrptnsnstckxchng) 
        -   [`[2021-01-11]` oh very nice; it has an RSS feed](#hvryncnhsrssfd) 
-   [python speech recognition https://www.reddit.com/r/Python/comments/86440q/the\_ultimate\_guide\_to\_speech\_recognition\_with/](#pythnspchrcgntnswwwrddtcmtsqthltmtgdtspchrcgntnwth) 
-   [Mozilla goes multilingual with open source Common Voice speech recognition datasets](#mzllgsmltlnglwthpnsrccmmnvcspchrcgntndtsts) 
-   [Build a voice interface in three minutes with PORCUPiNE](#bldvcntrfcnthrmntswthprcpn) 
    -   [`[2018-05-08]` https://github.com/Picovoice/Porcupine](#sgthbcmpcvcprcpn) 
-   [`[2018-10-03]` ugh, researched a bit and it seems like there are no decent Russian models.. dunno, maybe I could train one against Mozilla's DeepSpeech??](#ghrsrchdbtndtsmslkthrrndcmybcldtrnngnstmzllsdpspch) 
-   [TTS on linux https://notgnoshi.github.io/spd-say/](#ttsnlnxsntgnshgthbspdsy) 
-   [https://www.reddit.com/r/linux/comments/76611l/voice\_to\_text\_software\_for\_linux/](#swwwrddtcmrlnxcmmntslvcttxtsftwrfrlnx) 
-   [ugh all the linux soft for speech recognition seems to be a bit shit :(](#ghllthlnxsftfrspchrcgntnsmstbbtsht) 
-   [mozilla voice dataset? https://blog.mozilla.org/blog/2017/11/29/announcing-the-initial-release-of-mozillas-open-source-speech-recognition-model-and-voice-dataset](#mzllvcdtstsblgmzllrgblgnnnsrcspchrcgntnmdlndvcdtst) 
    -   [https://github.com/mozilla/DeepSpeech/issues/1181](#sgthbcmmzlldpspchsss) 
-   [https://github.com/DragonComputer/Dragonfire](#sgthbcmdrgncmptrdrgnfr) 
-   [just take a look at kaldi?](#jsttklktkld) 
-   [https://www.reddit.com/r/Python/comments/6zzqvi/speechpy\_a\_library\_for\_speech\_processing\_and/](#swwwrddtcmrpythncmmntszzqpchpylbrryfrspchprcssngnd) 
-   [.](#6529_7491) 
-   [https://en.wikipedia.org/wiki/List\_of\_speech\_recognition\_software](#snwkpdrgwklstfspchrcgntnsftwr) 
-   [kinda overlaps with file:speech-recognition.org](#kndvrlpswthflspchrcgntnrg) 
-   [http://tuxdiary.com/2015/05/25/lispeak/ lispeak - apparently used old Google API and is dead now](#txdrycmlspklspkpprntlysdldgglpndsddnw) 
-   [related](#rltd) [[degoogle]] [[automation]] [[infra]] [[desktop]]
-   [configure simon](#cnfgrsmn) 
-   [https://github.com/julius-speech/julius](#sgthbcmjlsspchjls) 
-   [`[2019-10-19]` Common Voice](#cmmnvc) 
-   [`[2020-01-01]` Rhasspy is an open source, fully offline voice assistant toolkit](#snwsycmbntrcmtmdrhsspysnpnsrcfllyfflnvcssstnttlkt) 
-   [`[2019-06-25]` Challenges in open source voice interfaces | Opensource.com](#chllngsnpnsrcvcntrfcspnsrccm) [[mycroft]]
    -   [`[2019-07-25]` good article breaking down the pipeline of a voice assistant](#gdrtclbrkngdwnthpplnfvcssctnt) 
-   [`[2018-10-31]` tried installing dragon 15 on win 8.1 VM but it would fail with https://nuance.custhelp.com/app/answers/detail/a\_id/5684 :(](#trdnstllngdrgnnwnvmbttwldthsnnccsthlpcmppnswrsdtld) [[dictation]]
-   [`[2020-10-21]` Hands-Free Coding: How I develop software using dictation and eye-tracking | Hacker News](#snwsycmbntrcmtmdhndsfrcdnrsngdcttnndytrcknghckrnws) [[dictation]]
-   [`[2019-06-29]` Elleo/gst-deepspeech: Speech recognition plugin for GStreamer based on Mozilla's DeepSpeech model https://github.com/Elleo/gst-deepspeech](#llgstdpspchspchrcgntnplgnspchmdlsgthbcmllgstdpspch) [[dictation]]
-   [`[2019-12-19]` at16k/at16k: Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.  https://github.com/at16k/at16k](#tktktrndmdlsfrtmtcspchrcgspchttxtcnvrsnsgthbcmtktk) 

It would be nice to set up some basic voice assistant, at least for very simple things.  
It's a shame I can't do simple things with voice when I'm not at the keyboard or my hands are busy:  


# take notes

that should be my first priority in implementing dictation  
could have start/stop commands and just record everything in between  
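
A minimal sketch of the start/stop idea, assuming some recognizer already yields a stream of transcribed phrases. The trigger phrases `start note` and `stop note` and the function name `collect_notes` are made up for illustration; nothing here is tied to a particular speech engine.

```python
from typing import Iterable, List

START = "start note"  # hypothetical trigger phrase
STOP = "stop note"    # hypothetical trigger phrase


def collect_notes(phrases: Iterable[str]) -> List[str]:
    """Group everything said between START and STOP into one note each."""
    notes: List[str] = []
    current: List[str] = []
    recording = False
    for phrase in phrases:
        text = phrase.strip().lower()
        if text == START:
            # A repeated start just restarts the current note.
            recording, current = True, []
        elif text == STOP:
            # Only keep the note if something was actually dictated.
            if recording and current:
                notes.append(" ".join(current))
            recording, current = False, []
        elif recording:
            current.append(phrase.strip())
    return notes
```

Anything said outside a start/stop pair is ignored, and a note that never gets its stop command is dropped; flushing it at the end of the stream instead might be the better choice for real use.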

<https://www.cnet.com/how-to/google-home-essential-ifttt-applets/>  
<https://support.google.com/googlehome/answer/7382893?hl=en-GB>  
<https://ifttt.com/recipes/480943-google-assistant-to-remember-the-milk>  
autovoice???  


# make bookmarks on youtube videos/vlc via voice


## <https://chrome.google.com/webstore/detail/lipsurf/lnnmjmalakahagblkkcnjkoaihlfglon> lipsurf! nice thing!

<https://www.reddit.com/r/speechrecognition/comments/8iedgm/my_voice_control_for_chrome_extension_that_i/dyx47bc/>  


# Voice pause vlc or music

    I'm using a somewhat messy workaround. I have installed KDEConnect, and use my smartphone as an input device. This allows me to use Android's dictation feature and works reasonably well.
    To everyone reading: you can help improve voice recognition in (open source) applications by donating your voice to Mozilla's Common Voice project.

<https://github.com/zmyaro/chrome-voice-actions>  
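
A rough sketch of mapping a recognized phrase to a playback command. Assumptions, none of them from the notes above: `playerctl` (an MPRIS command-line controller) is installed, the player (VLC or a music player) exposes MPRIS, and the phrase table is purely hypothetical. By default nothing is executed, so the lookup can be tried without a player running.

```python
import subprocess
from typing import List, Optional

# Hypothetical phrase table; playerctl is an assumption, not something from these notes.
COMMANDS = {
    "pause": ["playerctl", "pause"],
    "play": ["playerctl", "play"],
    "toggle": ["playerctl", "play-pause"],
    "next": ["playerctl", "next"],
}


def command_for(phrase: str) -> Optional[List[str]]:
    """Return the argv for a recognized phrase, or None if it is not a command."""
    return COMMANDS.get(phrase.strip().lower())


def handle(phrase: str, dry_run: bool = True) -> Optional[List[str]]:
    """Look up the phrase and, unless dry_run is set, run the command."""
    argv = command_for(phrase)
    if argv is not None and not dry_run:
        subprocess.run(argv, check=False)
    return argv
```

Calling `handle("pause", dry_run=False)` would actually invoke the command; exact matching on the whole phrase is deliberately strict so that ordinary speech does not pause the music by accident.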


# `[2018-08-27]` Michael Sheldon's Stuff » Speech Recognition – Mozilla’s DeepSpeech, GStreamer and IBus

<http://blog.mikeasoft.com/2017/12/30/speech-recognition-mozillas-deepspeech-gstreamer-and-ibus/>  


# using voice to code      [[programming]]

<http://ergoemacs.org/emacs/using_voice_to_code.html>  

    Comedian Dan Nainan has stopped typing and opted instead for voice recognition software Dragon Dictate on his smartphone and computer. He speaks into the phone or microphone and the software transcribes his words.
    “It’s freaking unbelievable, and it makes me so much more productive. I’m using it constantly, both at my desk and when I’m walking around. I have the new version for the phone that lets me dictate anywhere—waiting for a plane, backstage, or even walking in Manhattan,” he says. While working at Intel, Nainan had repetitive stress injuries in his wrists, so the technology has been a “godsend,” he says. While he says “training” the software to recognize your own speech patterns and accent isn’t necessary, the platform is more efficient when you take the time to do so, he says.


# `[2018-08-27]` Is there any decent speech recognition software for Linux? - Unix & Linux Stack Exchange

<https://unix.stackexchange.com/questions/256138/is-there-any-decent-speech-recognition-software-for-linux>  


## add to kibitzr? or some similar tool for tracking, apparently there are no subscriptions on stackexchange :(


### `[2021-01-11]` oh very nice; it has an RSS feed


# python speech recognition <https://www.reddit.com/r/Python/comments/86440q/the_ultimate_guide_to_speech_recognition_with/>


# Mozilla goes multilingual with open source Common Voice speech recognition datasets

<https://venturebeat.com/2018/06/07/mozilla-goes-multilingual-with-open-source-common-voice-speech-recognition-datasets/>  


# Build a voice interface in three minutes with PORCUPiNE

<https://www.youtube.com/watch?v=3z7LBW_Rl9c>  


## `[2018-05-08]` <https://github.com/Picovoice/Porcupine>


# `[2018-10-03]` ugh, researched a bit and it seems like there are no decent Russian models.. dunno, maybe I could train one against Mozilla's DeepSpeech??

this is the closest I could find: <https://github.com/sovse/Rus-SpeechRecognition-LSTM-CTC-VoxForge>  


# TTS on linux <https://notgnoshi.github.io/spd-say/>


# <https://www.reddit.com/r/linux/comments/76611l/voice_to_text_software_for_linux/> 


# ugh all the linux soft for speech recognition seems to be a bit shit :(


# mozilla voice dataset? <https://blog.mozilla.org/blog/2017/11/29/announcing-the-initial-release-of-mozillas-open-source-speech-recognition-model-and-voice-dataset>

apparently Python 2  
`deepspeech` command  
ugh, I've got Ivy Bridge and can't use AVX2. Do I have to build it on my own??  
<https://github.com/mind/wheels>  
<https://github.com/lakshayg/tensorflow-build>  

<https://www.reddit.com/r/MLQuestions/comments/8g5lar/is_there_any_way_to_use_mozilla_tensorflow/dyockrx/>  
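
A quick way to check which SIMD flags the CPU reports before picking a prebuilt wheel or deciding to build from source. This is a sketch for Linux only, reading the `flags` line of `/proc/cpuinfo`; the helper names `cpu_flags` and `supports` are made up here.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Parse the first 'flags' line of /proc/cpuinfo-style text into a set."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supports(flag: str, cpuinfo_text: str) -> bool:
    """True if the given flag (e.g. 'avx2') is listed among the CPU flags."""
    return flag in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only
    if path.exists():
        text = path.read_text()
        for flag in ("avx", "avx2", "fma"):
            print(flag, supports(flag, text))
```

On an Ivy Bridge machine this should report `avx` as present and `avx2` and `fma` as missing, which is what rules out binaries compiled with those instructions.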


## <https://github.com/mozilla/DeepSpeech/issues/1181> 


# <https://github.com/DragonComputer/Dragonfire> 

`python setup.py develop --user`  
`./dragonfire/__init__.py`  
`-c` for command mode  
ok, the kaldi test ended up running...  
<https://github.com/DragonComputer/Dragonfire/blob/master/CONTRIBUTING.md#styleguides> omg... emoji guide to commit messages, that guy is a bit ridiculous  


# just take a look at kaldi?

<https://github.com/grib0ed0v/kaldi-for-russian>  
Generally Kaldi is much more accurate than current CMUSphinx, however, if your audio has background noise, both will be quite useless. Music on background significantly affects speech recognition performance.  


# <https://www.reddit.com/r/Python/comments/6zzqvi/speechpy_a_library_for_speech_processing_and/> 


# . 

    Voice Attack: failed to pick up my American accent. Failed to pick up my friend's British accent. Have not been able to try it as it appears to be incapable of recognizing even the shortest words ("Hello" spoken out loudly became "Oh no")
    
    (G)AVPI: I created a profile successfully, open it, and am greeted with the message "No recognizer of the required ID found". The program fails to work at all.
    
    Dragon Naturally Speaking 13 Premium: Amazing voice recognition, these guys did solid work. However, the Premium version does not allow for your own macros, only text. Only the pro version does this, which is $599. Sadly, I could not find a free version of the Pro version through my regular channels. The premium version of this product does not work with Elite Dangerous.
    
    That leaves me out of ideas. Has anyone else found voice recognition software that actually works, and does not cost a fortune?


# <https://en.wikipedia.org/wiki/List_of_speech_recognition_software> 


# kinda overlaps with <speech-recognition.md>


# <http://tuxdiary.com/2015/05/25/lispeak/> lispeak - apparently used old Google API and is dead now


# related       [[degoogle]] [[automation]] [[infra]] [[desktop]]


# configure simon

    sudo apt install libqt4-sql-sqlite

what's up with default dictionary??  


# <https://github.com/julius-speech/julius> 


# `[2019-10-19]` Common Voice

<https://voice.mozilla.org/en?utm_source=missionmozillians&utm_medium=snippet&utm_campaign=common_voice_volunteers_october&utm_term=21473&utm_content=REL>  


# `[2020-01-01]` [Rhasspy is an open source, fully offline voice assistant toolkit](https://news.ycombinator.com/item?id=21926027)

<https://rhasspy.readthedocs.io/en/latest/>  


# `[2019-06-25]` Challenges in open source voice interfaces | Opensource.com      [[mycroft]]

<https://opensource.com/article/19/1/open-source-voice-interfaces>  


## `[2019-07-25]` good article breaking down the pipeline of a voice assistant


# `[2018-10-31]` tried installing dragon 15 on win 8.1 VM but it would fail with <https://nuance.custhelp.com/app/answers/detail/a_id/5684> :(      [[dictation]]


# `[2020-10-21]` [Hands-Free Coding: How I develop software using dictation and eye-tracking | Hacker News](https://news.ycombinator.com/item?id=24846887)      [[dictation]]


# `[2019-06-29]` Elleo/gst-deepspeech: Speech recognition plugin for GStreamer based on Mozilla's DeepSpeech model <https://github.com/Elleo/gst-deepspeech>      [[dictation]]


# `[2019-12-19]` at16k/at16k: Trained models for automatic speech recognition (ASR). A library to quickly build applications that require speech to text conversion.  <https://github.com/at16k/at16k>

    ask what the differences from DeepSpeech etc. are?