Parsing speaker labels returned by the asynchronous Amazon Transcribe Streaming SDK for Python

Problem description

I am building a service that transcribes a live audio stream. The asynchronous Amazon Transcribe Streaming SDK for Python offers the option to distinguish between speakers.

When the show_speaker_label=True parameter is passed in the client configuration, the API returns a speaker label for each word, as follows:

{
  "Transcript": {
    "Results": [
      {
        "Alternatives": [
          {
            "Items": [
              {
                "Confidence": 0.97,"Content": "From","EndTime": 18.98,"Speaker": "0","StartTime": 18.74,"Type": "pronunciation","VocabularyFilterMatch": false
              },{
                "Confidence": 1,"Content": "the","EndTime": 19.31,"StartTime": 19,"Content": "last","EndTime": 19.86,"StartTime": 19.32,...
              {
                "Confidence": 1,"Content": "chronic","EndTime": 22.55,"StartTime": 21.97,...
                "Confidence": 1,"Content": "fatigue","EndTime": 24.42,"StartTime": 23.95,{
                "EndTime": 25.22,"StartTime": 25.22,"Type": "speaker-change",{
                "Confidence": 0.99,"Content": "True","EndTime": 25.63,"Speaker": "1",{
                "Content": ".","StartTime": 25.63,"Type": "punctuation","VocabularyFilterMatch": false
              }
            ],"Transcript": "From the last note she still has mild sleep deprivation and chronic fatigue True."
          }
        ],"IsPartial": false,"ResultId": "XXXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXX","StartTime": 18.74
      }
    ]
  }
}

I want to output a simple line-by-line transcript that includes a speaker label for each sentence, like this:

Speaker 1: Hello my name is Frank, what is yours?
Speaker 2: Hi, my name is Lucy. Nice to meet you.

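However, I am not sure which strategy to use to parse the API response. Is it best to loop over the items and keep track of who is currently speaking? Or should I iterate over the results and wait until I encounter an item of type "speaker-change"?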

I have already searched Google for examples, but the solutions I found are either a bit messy or only apply to the JSON response returned for batch transcription.

Does anyone have experience parsing this kind of result correctly? Your input would be very helpful.
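
For what it's worth, the first strategy (loop over the items and track the current speaker) can be sketched roughly as below. This is only a sketch under assumptions: it treats each item as a plain dict with the keys shown in the JSON above ("Type", "Content", "Speaker"), whereas the SDK's own result objects may expose these fields differently, so adapt the lookups as needed. It also assumes you only pass in items from results where "IsPartial" is false. The function names group_items_by_speaker and format_turns are made up for illustration. It groups by speaker turn rather than by sentence, carries the last seen speaker forward when an item has no "Speaker" field, and skips "speaker-change" items, since the per-word label already carries the same information.

```python
from typing import Iterable, List, Mapping, Optional, Tuple

Turn = Tuple[Optional[str], str]


def group_items_by_speaker(items: Iterable[Mapping]) -> List[Turn]:
    """Collapse word-level items into (speaker, text) turns.

    Assumes dict-shaped items with the keys from the question's JSON.
    """
    turns: List[Turn] = []
    current_speaker: Optional[str] = None
    pieces: List[str] = []

    def flush() -> None:
        # Close the turn collected so far, if there is one.
        if pieces:
            turns.append((current_speaker, "".join(pieces).strip()))
            pieces.clear()

    for item in items:
        item_type = item.get("Type")
        if item_type == "speaker-change":
            # Marker only; the next word carries the new "Speaker" value.
            continue
        content = item.get("Content", "")
        if item_type == "punctuation":
            # Attach punctuation to the previous word without a space.
            pieces.append(content)
            continue
        # Fall back to the last known speaker if the label is missing.
        speaker = item.get("Speaker", current_speaker)
        if speaker != current_speaker:
            flush()
            current_speaker = speaker
        pieces.append(" " + content)

    flush()
    return turns


def format_turns(turns: Iterable[Turn]) -> str:
    """Render turns as 'Speaker <label>: <text>' lines."""
    return "\n".join(f"Speaker {speaker}: {text}" for speaker, text in turns)


if __name__ == "__main__":
    # Shortened, hand-written items in the shape of the response above.
    sample = [
        {"Content": "From", "Speaker": "0", "Type": "pronunciation"},
        {"Content": "the", "Type": "pronunciation"},
        {"Content": "fatigue", "Speaker": "0", "Type": "pronunciation"},
        {"Type": "speaker-change"},
        {"Content": "True", "Speaker": "1", "Type": "pronunciation"},
        {"Content": ".", "Type": "punctuation"},
    ]
    print(format_turns(group_items_by_speaker(sample)))
```

The labels are printed exactly as the API returns them ("0", "1"); if you want "Speaker 1" and "Speaker 2" as in the desired output, you would have to map them yourself. In a streaming setup you would probably keep the current speaker and the unfinished turn between results instead of starting from scratch for each one.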
