How to identify sentences in a paragraph transcribed from audio in Python (speech to text)

Problem description

This is my code:

import speech_recognition as sr
import os

def Speech_to_text(speech_to_text_name):

    # Create the recognizer
    r = sr.Recognizer()

    BASE_DIR = os.path.dirname(os.path.abspath(__file__))
    # FILE_PATH = os.path.join(BASE_DIR,"noise_removed_lectures\\noise_removed_lectures_{}".format(speech_to_text_name))
    FILE_PATH = os.path.join(BASE_DIR,"noise_removed_lectures\\{}".format(speech_to_text_name))
    print('file path: ',FILE_PATH)
    # DESTINATION_DIR = os.path.dirname(os.path.join(BASE_DIR,"LectureSummarizingApp\\speechToText\\{}.txt".format(speech_to_text_name)))
    DESTINATION_DIR = os.path.join(BASE_DIR,"speechToText\\{}.txt".format(speech_to_text_name))
    print('destination directory: ',DESTINATION_DIR)

    with sr.AudioFile(FILE_PATH) as source:
        audio = r.listen(source)
        # file = open('audioToText01.txt','w') #open file
        file = open(DESTINATION_DIR,'w') # open the output file
        try:
            text = r.recognize_google(audio) # convert using the Google recognizer
            file.write(text)
        except Exception:
            # recognition failed (e.g. unintelligible audio or a request error)
            file.write('error')

        file.close()

I also need to split the text into separate sentences. How can I do that?

Solution

You can use split() with a separator to build a list of sentences from your string.

text = 'This is the first sentence. This is the second,and its a bit longer.'
sentences = text.split('. ') # Split the string at every dot followed by a space

print(sentences)

>> ['This is the first sentence', 'This is the second,and its a bit longer.']
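
Note that split('. ') drops the period from every sentence except the last and ignores '?' and '!'. If that matters, a regular expression is one possible alternative. The following is only a minimal sketch: split_sentences and transcript are hypothetical names introduced for illustration, and it assumes the recognized text actually contains sentence-ending punctuation, which the recognizer's output may not.

```python
import re

def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace.

    Hypothetical helper: the punctuation stays attached to each sentence.
    """
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    # Drop empty strings, e.g. when the input itself is empty
    return [p for p in parts if p]

# Assumed sample text standing in for the recognizer's output
transcript = 'This is the first sentence. Is this the second? Yes, it is!'
print(split_sentences(transcript))
# ['This is the first sentence.', 'Is this the second?', 'Yes, it is!']
```

This simple pattern will also split after abbreviations such as "e.g." or "Dr.", so it is best treated as a starting point and not as a complete sentence tokenizer.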