Problem running Python code for speech translation with event-based synthesis

Problem description

import time
import azure.cognitiveservices.speech as speechsdk

from_language = 'fr-FR'
to_language = 'en-US'

def synthesis_callback(evt):
    size = len(evt.result.audio)
    print(f'Audio synthesized: {size} byte(s) {"(COMPLETED)" if size == 0 else ""}')

    if size > 0:
        file = open('translation.wav','wb+')
        file.write(evt.result.audio)
        file.close()

def translate_speech_to_text():

    translation_config = speechsdk.translation.SpeechTranslationConfig(
            subscription=speech_key,region=service_region)
    translation_config.speech_recognition_language = from_language
    translation_config.add_target_language(to_language)

    translation_config.voice_name = "en-US-JennyNeural"
    audio_input = speechsdk.AudioConfig(filename=filename)

    recognizer = speechsdk.translation.TranslationRecognizer(
            translation_config=translation_config, audio_config=audio_input)

    done = False

    def stop_cb(evt):
        """callback that stops continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        recognizer.stop_continuous_recognition()
        nonlocal done
        done = True

    # Connect callbacks to the events fired by the speech recognizer
    recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
    recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
    # stop continuous recognition on either session stopped or canceled events
    recognizer.session_stopped.connect(stop_cb)
    recognizer.canceled.connect(stop_cb)

    # Start continuous speech recognition
    recognizer.start_continuous_recognition()

    while not done:
        time.sleep(.5)
    
translate_speech_to_text()

I am trying to run the event-based synthesis sample Python code from the Azure Cognitive Services Speech documentation. Apart from my own subscription ID and the locales, I have changed the call from identify_once() to start_continuous_recognition() so that the whole audio file is read, but it does not produce any output .wav file. I have also tried the start_continuous_recognition_async() function, and it does not work either.

What else is needed to synthesize the audio from one language into another?

Solution

The voice name "en-US-Hedda" does not exist. Please use one of the supported voice names listed here: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support#text-to-speech
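
For reference, below is a minimal sketch of the synthesis wiring, assuming a supported voice such as "en-US-JennyNeural" (the one shown in the edited listing above) and assuming the synthesizing event is connected to synthesis_callback, which the listing in the question never does. The subscription key, region, and input filename are placeholders to replace with your own values.

import time
import azure.cognitiveservices.speech as speechsdk

speech_key = '<your-subscription-key>'   # placeholder
service_region = '<your-region>'         # placeholder, e.g. 'westeurope'
filename = 'input.wav'                   # placeholder input audio file

translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription=speech_key, region=service_region)
translation_config.speech_recognition_language = 'fr-FR'
translation_config.add_target_language('en-US')
# Voice name must come from the supported list linked above
translation_config.voice_name = 'en-US-JennyNeural'

audio_input = speechsdk.AudioConfig(filename=filename)
recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=translation_config, audio_config=audio_input)

def synthesis_callback(evt):
    # evt.result.audio carries one chunk of synthesized audio;
    # an empty chunk signals that synthesis has completed
    if len(evt.result.audio) > 0:
        with open('translation.wav', 'ab') as f:  # append so earlier chunks are kept
            f.write(evt.result.audio)

# Without this connection nothing is ever written to translation.wav,
# even when the voice name is valid
recognizer.synthesizing.connect(synthesis_callback)

done = False

def stop_cb(evt):
    # stop continuous recognition once the session ends or is canceled
    global done
    recognizer.stop_continuous_recognition()
    done = True

recognizer.session_stopped.connect(stop_cb)
recognizer.canceled.connect(stop_cb)

recognizer.start_continuous_recognition()
while not done:
    time.sleep(0.5)

Opening the output file in append mode is a deliberate change from the 'wb+' used in the question: the synthesizing event can fire several times for a longer file, and truncating on every callback would keep only the last chunk.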