How to mock transcription results with pytest without actually sending data to the Azure Speech API

Problem description

I'm having trouble writing a pytest test for the function below.

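Is there a way to mock a function or object so that every test run gets the same fake transcription result, without sending any data to the real Azure Speech service (and therefore without being billed)?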

import time

import azure.cognitiveservices.speech as speechsdk


def transcribe_azure_speech_file(speech_file):
    # SUBSCRIPTION_KEY and REGION are expected to be defined elsewhere in this module
    speech_config = speechsdk.SpeechConfig(subscription=SUBSCRIPTION_KEY, region=REGION)
    audio_input = speechsdk.AudioConfig(filename=speech_file)
    auto_source_lang_config = speechsdk.AutoDetectSourceLanguageConfig(languages=["en-US"])
    speech_recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config,
        audio_config=audio_input,
        auto_detect_source_language_config=auto_source_lang_config,
    )

    transcripts = []
    done = False

    def _stop_cb(evt):
        """callback that signals to stop continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        nonlocal done
        done = True

    def _recognized_event(evt):
        result = evt.result
        # offset and duration are reported in 100-nanosecond ticks (1e7 ticks per second)
        start_time = result.offset / 1e7
        end_time = start_time + result.duration / 1e7

        transcripts.append(
            {
                "transcript": result.text,
                "start_time": start_time,
                "end_time": end_time,
            }
        )

    # connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognized.connect(_recognized_event)
    # stop continuous recognition on either session stopped or canceled events
    speech_recognizer.session_stopped.connect(_stop_cb)
    speech_recognizer.canceled.connect(_stop_cb)

    # start continuous speech recognition
    speech_recognizer.start_continuous_recognition()
    while not done:
        time.sleep(0.25)

    speech_recognizer.stop_continuous_recognition()

    return transcripts

Solution

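If you just want to avoid being charged for calls to the Azure Speech API, you can create a Speech resource on the free pricing tier (the F0 plan). Note that requests still go to Azure in that case; they are simply not billed within the free tier's limits.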
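
If the goal is to keep tests from contacting Azure at all, another option is to swap the SDK for a fake inside the test. The following is only a minimal sketch, assuming the function lives in a module that imports the SDK under the name `speechsdk`, as in the question. `FakeSignal`, `FakeRecognizer`, and `make_fake_speechsdk` are hypothetical helper names, not part of the Azure SDK; they only mimic the handful of attributes and methods that the function in the question actually uses.

```python
from types import SimpleNamespace
from unittest import mock


class FakeSignal:
    """Hypothetical stand-in for an SDK event: remembers callbacks passed to connect()."""

    def __init__(self):
        self._callbacks = []

    def connect(self, callback):
        self._callbacks.append(callback)

    def fire(self, evt):
        for callback in self._callbacks:
            callback(evt)


class FakeRecognizer:
    """Hypothetical stand-in for SpeechRecognizer that replays canned results."""

    def __init__(self, canned_results):
        # Each entry is (text, offset, duration); offset and duration are in ticks.
        self._canned_results = canned_results
        self.recognized = FakeSignal()
        self.session_stopped = FakeSignal()
        self.canceled = FakeSignal()

    def start_continuous_recognition(self):
        # Fire everything synchronously so the caller's polling loop exits at once.
        for text, offset, duration in self._canned_results:
            result = SimpleNamespace(text=text, offset=offset, duration=duration)
            self.recognized.fire(SimpleNamespace(result=result))
        self.session_stopped.fire(SimpleNamespace())

    def stop_continuous_recognition(self):
        pass  # nothing to shut down in the fake


def make_fake_speechsdk(canned_results):
    """Build an object that can replace the `speechsdk` name in the module under test."""
    fake_sdk = mock.MagicMock(name="speechsdk")
    # SpeechConfig, AudioConfig, etc. stay plain mocks; only the recognizer needs behavior.
    fake_sdk.SpeechRecognizer = lambda *args, **kwargs: FakeRecognizer(canned_results)
    return fake_sdk
```

In a pytest test this could be used with the built-in `monkeypatch` fixture, for example `monkeypatch.setattr(transcriber, "speechsdk", make_fake_speechsdk([("hello world", 0, 15_000_000)]))`, where `transcriber` is a hypothetical name for the module that defines `transcribe_azure_speech_file`. Calling `transcriber.transcribe_azure_speech_file("dummy.wav")` afterwards should then return a single entry with transcript "hello world", start_time 0.0 and end_time 1.5, and nothing is sent over the network. Importing that module still requires the SDK package to be installed, since the import at the top of the module runs before the patch is applied.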