Transcribing system audio with speech_recognition

Problem description

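I am trying to use Python speech_recognition to take input from the system audio and print the recognized text as output. Unfortunately, I have run into problems with the device list: speech_recognition only seems to accept the microphone as an input device.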

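My idea is this: I am slow at taking notes during important video calls, so I would like Python speech recognition to write things down for me so that I can catch up on the parts I missed. Do you think this is possible? If so, how?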

Here is my code so far:

import pyaudio
import speech_recognition as sr

r = sr.Recognizer()
r.energy_threshold = 4000

for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print("Microphone with name \"{1}\" found for `Microphone(device_index={0})`".format(index, name))

# returns a list of 15 devices (microphone, system speakers, headphones...)

with sr.Microphone(device_index=4) as source:
    audio = r.listen(source)
# index = 4 is my headphones

try:
    print("Speech was:" + r.recognize_google(audio))
except LookupError:
    print('Speech not understood')

At first glance it looked fine. At runtime, however, it fails to open my headphones (system audio) as a device and returns the following error:

Traceback (most recent call last):
  File "...",line 10,in <module>
    with sr.Microphone(device_index=4) as source:
  File "...\speech_recognition\__init__.py",line 141,in __enter__
    input=True,# stream is an input stream
  File "...pyaudio.py",line 750,in open
    stream = Stream(self,*args,**kwargs)
  File "...pyaudio.py",line 441,in __init__
    self._stream = pa.open(**arguments)
OSError: [Errno -9998] Invalid number of channels

And when I use the "normal microphone" as input:

line 858,in recognize_google
    if not isinstance(actual_result,dict) or len(actual_result.get("alternative",[])) == 0: raise UnknownValueError()
speech_recognition.UnknownValueError

Can you help me solve this?

Solution

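The microphone error occurs when the recorded audio cannot be interpreted as words. You can handle it by wrapping the recognition call in a try/except that uses "except [name you imported the module as].UnknownValueError", for example "except sr.UnknownValueError". Whatever you put under that except will then run whenever UnknownValueError is raised. As your traceback shows, the "except LookupError" in your code does not catch it. As for the headphones, I think the problem is that you are using an output device for an input function, which would explain the "Invalid number of channels" error.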
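Both points can be sketched together as follows. This is only a rough sketch under some assumptions: the helper names (input_capable_indexes, list_device_infos, transcribe, main) are made up for illustration, and it assumes that PyAudio reports a device's recording capability in the "maxInputChannels" field of its device info, so that output-only devices such as headphones show 0 there. It does not capture system audio by itself; that would still need an input-capable device that carries the system sound, if your system exposes one.

```python
def input_capable_indexes(device_infos):
    """Return the indexes of devices that can record audio.

    device_infos: dicts shaped like the result of
    pyaudio.PyAudio().get_device_info_by_index(i).
    """
    return [
        info["index"]
        for info in device_infos
        if info.get("maxInputChannels", 0) > 0
    ]


def list_device_infos():
    """Ask PyAudio for the info dict of every audio device."""
    import pyaudio  # imported here so the other helpers work without it

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i)
                for i in range(pa.get_device_count())]
    finally:
        pa.terminate()


def transcribe(recognize, audio, not_understood):
    """Call recognize(audio); return None if not_understood is raised."""
    try:
        return recognize(audio)
    except not_understood:
        return None


def main():
    """Record once from the first input-capable device and print the text."""
    import speech_recognition as sr

    usable = input_capable_indexes(list_device_infos())
    if not usable:
        print("No input-capable device found")
        return

    recognizer = sr.Recognizer()
    with sr.Microphone(device_index=usable[0]) as source:
        audio = recognizer.listen(source)

    # Catch sr.UnknownValueError instead of LookupError.
    text = transcribe(recognizer.recognize_google, audio, sr.UnknownValueError)
    if text is None:
        print("Speech not understood")
    else:
        print("Speech was: " + text)
```

Calling main() tries this with real hardware. Passing a device index that is not in the list returned by input_capable_indexes would be expected to fail in the same way as device_index=4 did in the question.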

google-speech-api live-streaming pyaudio python speech-recognition
