2018-4-30: This method no longer works. If you don't mind paying, see the Google Cloud Speech to Text API instead.
--
This is where it started
Reference
--
Preparing the audio
ffmpeg -i source_file -ar 16000 output.flac
If the format is wrong, nothing is recognized at all.
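Since a wrong format makes recognition fail outright, it can help to check the file before uploading it. The following is a minimal sketch, not part of the original workflow: it reads the sample rate from the STREAMINFO block at the start of a FLAC stream so it can be compared with the `rate=` value sent in the request. The function name `flac_sample_rate` is hypothetical.

```python
def flac_sample_rate(header: bytes) -> int:
    """Return the sample rate stored in a FLAC stream's STREAMINFO block.

    `header` must hold at least the first 21 bytes of the file.
    """
    # A native FLAC stream starts with the 4-byte marker "fLaC".
    if len(header) < 21 or header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream (missing 'fLaC' marker)")
    # The first metadata block must be STREAMINFO (block type 0);
    # the top bit of this byte is only the "last block" flag.
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # Inside STREAMINFO the sample rate is a 20-bit field that follows
    # the block sizes (2 + 2 bytes) and frame sizes (3 + 3 bytes),
    # so it starts at byte 18 of the file.
    return (header[18] << 12) | (header[19] << 4) | (header[20] >> 4)
```

To use it, read the first 21 bytes of the converted file (for example `open("output.flac", "rb").read(21)`) and confirm the result is 16000 before sending the request.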
--
Sending the recognition request with curl
curl -X POST --data-binary @1.flac --header 'Content-Type: audio/x-flac; rate=16000;' 'https://www.google.com/speech-api/v2/recognize?client=chromium&output=json&lang=zh-TW&key=AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw'
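The same request can be assembled in Python with only the standard library. This is a sketch under assumptions: `build_request` is a hypothetical helper, the key is passed in by the caller, and the function only builds the request object without sending it (per the note at the top, the endpoint stopped working on 2018-4-30).

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from the curl command above.
ENDPOINT = "https://www.google.com/speech-api/v2/recognize"


def build_request(flac_bytes, key, lang="zh-TW", rate=16000):
    """Build (but do not send) the POST request that the curl command issues."""
    query = urlencode({
        "client": "chromium",
        "output": "json",
        "lang": lang,
        "key": key,
    })
    return Request(
        f"{ENDPOINT}?{query}",
        data=flac_bytes,  # raw FLAC bytes, like curl's --data-binary
        headers={"Content-Type": f"audio/x-flac; rate={rate}"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the upload. The `rate` value should match the `-ar` value given to ffmpeg.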
--
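What if it stops working some day?
Go read the source of the Python library (the `speech_recognition` package named in the docstring below) and simply lift whatever approach it currently uses.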
def recognize_google(self, audio_data, key=None, language="en-US", show_all=False):
    """
    Performs speech recognition on ``audio_data`` (an ``AudioData`` instance), using the Google Speech Recognition API.

    The Google Speech Recognition API key is specified by ``key``. If not specified, it uses a generic key that works out of the box. This should generally be used for personal or testing purposes only, as it **may be revoked by Google at any time**.

    To obtain your own API key, simply following the steps on the `API Keys <http://www.chromium.org/developers/how-tos/api-keys>`__ page at the Chromium Developers site. In the Google Developers Console, Google Speech Recognition is listed as "Speech API".

    The recognition language is determined by ``language``, an RFC5646 language tag like ``"en-US"`` (US English) or ``"fr-FR"`` (International French), defaulting to US English. A list of supported language tags can be found in this `StackOverflow answer <http://stackoverflow.com/a/14302134>`__.

    Returns the most likely transcription if ``show_all`` is false (the default). Otherwise, returns the raw API response as a JSON dictionary.

    Raises a ``speech_recognition.UnknownValueError`` exception if the speech is unintelligible. Raises a ``speech_recognition.RequestError`` exception if the speech recognition operation failed, if the key isn't valid, or if there is no internet connection.
    """
    assert isinstance(audio_data, AudioData), "``audio_data`` must be audio data"
    assert key is None or isinstance(key, str), "``key`` must be ``None`` or a string"
    assert isinstance(language, str), "``language`` must be a string"

    flac_data = audio_data.get_flac_data(
        convert_rate=None if audio_data.sample_rate >= 8000 else 8000,  # audio samples must be at least 8 kHz
        convert_width=2  # audio samples must be 16-bit
    )
    if key is None: key = "AIzaSyBOti4mM-6x9WDnZIjIeyEU21OpBXqWBgw"
    url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
        "client": "chromium",
        "lang": language,
        "key": key,
    }))
    request = Request(url, data=flac_data, headers={"Content-Type": "audio/x-flac; rate={}".format(audio_data.sample_rate)})
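The excerpt above stops once the request is built, so how the reply is read is not shown. The sketch below is an assumption about the reply format, not something confirmed by the excerpt: it supposes the body is one JSON object per line, that empty `result` lists may come first, and that the text sits under `alternative` then `transcript`. The helper name `first_transcript` is hypothetical.

```python
import json


def first_transcript(response_text):
    """Return the first transcript in the response text, or None if there is none.

    Assumed (unverified) layout: one JSON object per line, each with a
    "result" list whose entries hold an "alternative" list of candidates.
    """
    for line in response_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between JSON objects
        results = json.loads(line).get("result", [])
        if not results:
            continue  # an empty result list carries no transcription
        alternatives = results[0].get("alternative", [])
        if alternatives:
            return alternatives[0].get("transcript")
    return None
```

If the real reply is shaped differently, only the key names in this function should need changing.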
--