Problem description
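I have speech audio files in wav format, each 60 seconds long. However, the output is truncated and only captures about 15% of the length. I have tried this both in a local Jupyter Notebook and in Google Colab. According to the documentation, this request is below the API's threshold. What am I doing wrong, or how can I work around this limit?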
import speech_recognition as sr

# Create a recognizer; recognize_google() calls the Google Web Speech API
r = sr.Recognizer()
interview = sr.AudioFile('sample.wav')
with interview as source:
    print('Ready...')
    r.pause_threshold = 2
    # Read up to 60 seconds of audio from the file
    audio = r.record(source, duration=60)
print(type(audio))
# Language tags use a hyphen (en-CA), not an underscore
transcription = r.recognize_google(audio, language='en-CA')
print(transcription)
Solution
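Try the code below. If the output is still the same as before, use the try and except blocks to check whether an error is being raised, or change the pause_threshold value.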
import speech_recognition as sr

r = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:
    print("Ready")
    r.pause_threshold = 0.6
    audio = r.record(source)
try:
    s = r.recognize_google(audio)
    print("Text: " + s)
except sr.UnknownValueError:
    print("Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Error {0}".format(e))
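
If the transcript is still cut short, one workaround worth trying is to send the file in shorter pieces and join the results. As far as I know, pause_threshold is used by listen() to decide when a phrase ends, not by record(), so changing it may make no difference when reading from a file. The sketch below is only an illustration under my own assumptions: the helper names chunk_spans and transcribe_in_chunks and the 15-second chunk length are hypothetical choices, not something from the library or the original post. It relies on consecutive record() calls inside one with block continuing from where the previous call stopped.

```python
import math


def chunk_spans(total_seconds, chunk_seconds):
    """Return (start, length) pairs that cover total_seconds in steps of chunk_seconds."""
    count = math.ceil(total_seconds / chunk_seconds)
    return [
        (i * chunk_seconds, min(chunk_seconds, total_seconds - i * chunk_seconds))
        for i in range(count)
    ]


def transcribe_in_chunks(path, chunk_seconds=15, language="en-US"):
    """Transcribe a wav file piece by piece (hypothetical helper, not part of the library)."""
    # Imported here so chunk_spans() can be used without the package installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    pieces = []
    with sr.AudioFile(path) as source:
        # source.DURATION is the length of the file in seconds.
        for _start, length in chunk_spans(source.DURATION, chunk_seconds):
            # Each record() call reads the next `length` seconds of the stream.
            audio = r.record(source, duration=length)
            try:
                pieces.append(r.recognize_google(audio, language=language))
            except sr.UnknownValueError:
                # Nothing intelligible in this piece; skip it and carry on.
                continue
    return " ".join(pieces)


if __name__ == "__main__":
    print(chunk_spans(60, 15))
    # With a real file and network access:
    # print(transcribe_in_chunks("sample.wav", language="en-CA"))
```

Cutting at fixed boundaries can split a word in half, so the joined text may contain small errors near the cut points. A sr.RequestError is deliberately not caught here so that network or quota problems stay visible.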