Last year, Mozilla open-sourced its speech engine, DeepSpeech, which performs speech-to-text transcription. The engine is based on the paper ‘Deep Speech: Scaling up end-to-end speech recognition’, published on arXiv by researchers at Baidu.
Several weeks ago, I wrote a program called InterviewSecrectary that transcribes speech/audio to text using the iFLYTEK speech engine. That engine is not free: each user pays about ￥10 RMB per hour of use.