Python: How to align two lists using the start/end timestamps in their items

Problem description

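I have two lists. Each is sorted by start_time, and no item's end_time overlaps with any other item in the same list:

# (word, start_time, end_time)
words = [('i', 5.12, 5.23), ('like', 5.24, 5.36), ('you', 5.37, 5.71), ('really', 7.21, 7.51), ('yes', 8.32, 8.54)]
# (speaker, start_time, end_time)
# NOTE: the speaker label of the last segment is missing in the source; 'spk2' is an assumption.
segments = [('spk1', 0.0, 1.25), ('spk2', 4.75, 6.25), ('spk1', 6.75, 7.75), ('spk2', 8.25, 9.25)]

I want to group the items of words that fall between the start_time and end_time of each item in segments, and produce something like this:

res = [('i', 'like', 'you'), ('really',), ('yes',)]

so that each item of res contains all the items of words whose start_time and end_time fall between the start_time and end_time of the corresponding item in segments.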
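The grouping rule described above can be written down directly as a brute-force check. It is quadratic, so it is not meant as the fast answer, but it may be useful as a reference when testing quicker versions. This is only a sketch under some assumptions: the function name `group_words_bruteforce` is made up for illustration, the speaker label of the last segment is assumed to be 'spk2', segments that contain no words are dropped, and one-word groups are returned as one-element tuples.

```python
def group_words_bruteforce(words, segments):
    """Group words by the segment that fully contains them.

    Hypothetical helper: compares every word against every segment,
    so it costs O(len(words) * len(segments)).
    """
    groups = []
    for _speaker, seg_start, seg_end in segments:
        inside = tuple(
            word
            for word, start, end in words
            if seg_start <= start and end <= seg_end  # word lies inside the segment
        )
        if inside:  # segments without any words are skipped
            groups.append(inside)
    return groups


# (word, start_time, end_time)
words = [('i', 5.12, 5.23), ('like', 5.24, 5.36), ('you', 5.37, 5.71),
         ('really', 7.21, 7.51), ('yes', 8.32, 8.54)]
# (speaker, start_time, end_time); the last label 'spk2' is an assumption
segments = [('spk1', 0.0, 1.25), ('spk2', 4.75, 6.25),
            ('spk1', 6.75, 7.75), ('spk2', 8.25, 9.25)]

print(group_words_bruteforce(words, segments))
# [('i', 'like', 'you'), ('really',), ('yes',)]
```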

Solution

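I came up with this solution while typing up the question. I guess Stack Overflow makes a good rubber duck. Still, I would love to know whether there is a more time-efficient approach.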

(code not preserved)

A single loop should be fast.

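res = [                                         # initialize beforehand
    # speaker, start, end, duration, speech (empty for now)
    [seg[0], seg[1], seg[2], round(seg[2] - seg[1], 2), '']
    for seg in segments
]
i = 0                                           # index of res
for word in words:                              # for each row of word
    # advance past every segment that ended before this word starts
    # (a loop, so that several empty segments in a row are skipped;
    # words after the last segment stay attached to the last segment)
    while i < len(res) - 1 and word[1] >= res[i][2]:
        i = i + 1                               # next res index
    if res[i][4]:                               # not empty speech
        res[i][4] = res[i][4] + ' ' + word[0]   # space in between
    else:                                       # empty speech
        res[i][4] = word[0]                     # initialize it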
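For comparison, here is a sketch of the same single-pass idea that returns the tuple-per-segment shape asked for in the question. It is not the code from either post. The function name `group_words` is hypothetical, and the sketch assumes that both lists are sorted by start time, that segments do not overlap each other, that a word counts only if it lies entirely inside a segment, and that segments without words are left out of the result.

```python
def group_words(words, segments):
    """Walk both sorted lists once; return one tuple of words per non-empty segment.

    Hypothetical helper. Runs in O(len(words) + len(segments)) because the
    word index `i` only ever moves forward.
    """
    groups = []
    i = 0  # index into words
    for _speaker, seg_start, seg_end in segments:
        # Skip words that start before this segment begins:
        # they cannot lie entirely inside it.
        while i < len(words) and words[i][1] < seg_start:
            i += 1
        current = []
        # Collect words that end no later than the segment does.
        while i < len(words) and words[i][2] <= seg_end:
            current.append(words[i][0])
            i += 1
        if current:  # drop segments that contain no words
            groups.append(tuple(current))
    return groups


# (word, start_time, end_time)
words = [('i', 5.12, 5.23), ('like', 5.24, 5.36), ('you', 5.37, 5.71),
         ('really', 7.21, 7.51), ('yes', 8.32, 8.54)]
# (speaker, start_time, end_time); the last label 'spk2' is an assumption
segments = [('spk1', 0.0, 1.25), ('spk2', 4.75, 6.25),
            ('spk1', 6.75, 7.75), ('spk2', 8.25, 9.25)]

print(group_words(words, segments))
# [('i', 'like', 'you'), ('really',), ('yes',)]
```

A word that straddles a segment boundary ends up in no group under these assumptions. If such words should be assigned by start time only, the second loop's condition would need to compare `words[i][1]` against `seg_end` instead.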

Happy Sunday!

