How to extract pairs of timestamps from a Python list

Problem description

I extracted all the timestamps from a transcript file. The output looks like this:

('[,00:00:03,950,00:00:06,840,00:00:09,180,'
 '00:00:10,830,00:00:10,00:00:14,070,00:00:16,890,'
 '00:00:16,00:00:19,080,00:00:21,590,'
 '00:00:24,030,00:00:24,00:00:26,910,00:00:29,640,'
 '00:00:29,00:00:31,920,00:00:35,850,'
 '00:00:38,629,00:00:38,00:00:40,859,00:00:43,170,'
 '00:00:43,00:00:45,570,00:00:48,'
 '00:00:52,019,00:00:52,00:00:54,449,00:00:57,210,'
 '00:00:57,00:00:59,519,00:01:02,690,'
 '00:01:05,820,00:01:05,00:01:08,549,00:01:10,490,'
 '00:01:10,00:01:13,409,00:01:16,'
 '00:01:18,149,00:01:18,00:01:20,340,00:01:22,649,'
 '00:01:22,00:01:26,159,00:01:28,740,'
 '00:01:30,810,00:01:30,00:01:33,719,00:01:36,990,'
 '00:01:36,00:01:39,119,00:01:41,759,'
 '00:01:43,799,00:01:43,00:01:46,619,00:01:49,140,'
 '00:01:49,00:01:51,240,00:01:53,'
 '00:01:56,460,00:01:56,00:01:58,00:02:01,'
 '00:02:01,00:02:04,00:02:07,229,'
 '00:02:09,380,00:02:09,00:02:12,060,00:02:14,]')

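In this output the timestamps always come in pairs, i.e. two consecutive timestamps always belong together, for example 00:00:03,950 and 00:00:06,840, 00:00:06,840 and 00:00:09,180, and so on.

Now I want to extract each of these timestamp pairs separately, so that the output looks like this: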

00:00:03,950 - 00:00:06,840

00:00:06,840 - 00:00:09,180

00:00:09,180 - 00:00:10,830

etc.

At the moment I have the following (very inconvenient) solution to my problem:

from textwrap import dedent

# get first part of first timestamp
a = res_timestamps[2:15]
print(dedent(a))

# get second part of first timestamp
b = res_timestamps[17:29]
print(b)

# combine timestamp parts
c = a + ' - ' + b
print(dedent(c))

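Of course, this is very bad, because I cannot work out the indexes by hand for every transcript. My attempts to use a loop have not succeeded so far, because each item in the loop is a single character rather than a timestamp.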
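To illustrate the loop problem: iterating over a Python string yields one character per step, so no item is ever a complete timestamp. A minimal sketch, using a hypothetical two-timestamp sample in the same format as the output above (the sample value and the name `first_items` are only for illustration):

```python
# Hypothetical short sample in the same format as the real output.
res_timestamps = "[,00:00:03,950,00:00:06,840,]"

# Looping over a string visits single characters, not timestamps.
first_items = [item for item in res_timestamps][:5]
print(first_items)  # ['[', ',', '0', '0', ':']
```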

Is there a solution to my problem?

Any help or hints would be appreciated.

Thank you very much!

Solution

Regular expressions to the rescue!

A solution that fits your sample data perfectly:

import re
from pprint import pprint

# collect (start, end) tuples; your_data is the string from the question
timestamps = re.findall(r"(\d{2}:\d{2}:\d{2},\d{3}),(\d{2}:\d{2}:\d{2},\d{3})", your_data)
pprint(timestamps)

This prints:

[('00:00:03,950', '00:00:06,840'),
 ('00:00:06,840', '00:00:09,180'),
 ('00:00:09,180', '00:00:10,830'),
 ('00:00:10,830', '00:00:14,070'),
 ('00:00:14,070', '00:00:16,890'),
 ('00:00:16,890', '00:00:19,080'),
 ('00:00:19,080', '00:00:21,590'),
 ('00:00:21,590', '00:00:24,030'),
 ('00:00:24,030', '00:00:26,910'),
 ('00:00:26,910', '00:00:29,640'),
 ('00:00:29,640', '00:00:31,920'),
 ('00:00:31,920', '00:00:35,850'),
 ('00:00:35,850', '00:00:38,629'),
 ('00:00:38,629', '00:00:40,859'),
 ('00:00:40,859', '00:00:43,170'),
 ('00:00:43,170', '00:00:45,570'),
 ('00:00:45,570', '00:00:48,'),
 ('00:00:48,', '00:00:52,019'),
 ('00:00:52,019', '00:00:54,449'),
 ('00:00:54,449', '00:00:57,210'),
 ('00:00:57,210', '00:00:59,519'),
 ('00:00:59,519', '00:01:02,690'),
 ('00:01:02,690', '00:01:05,820'),
 ('00:01:05,820', '00:01:08,549'),
 ('00:01:08,549', '00:01:10,490'),
 ('00:01:10,490', '00:01:13,409'),
 ('00:01:13,409', '00:01:16,'),
 ('00:01:16,', '00:01:18,149'),
 ('00:01:18,149', '00:01:20,340'),
 ('00:01:20,340', '00:01:22,649'),
 ('00:01:22,649', '00:01:26,159'),
 ('00:01:26,159', '00:01:28,740'),
 ('00:01:28,740', '00:01:30,810'),
 ('00:01:30,810', '00:01:33,719'),
 ('00:01:33,719', '00:01:36,990'),
 ('00:01:36,990', '00:01:39,119'),
 ('00:01:39,119', '00:01:41,759'),
 ('00:01:41,759', '00:01:43,799'),
 ('00:01:43,799', '00:01:46,619'),
 ('00:01:46,619', '00:01:49,140'),
 ('00:01:49,140', '00:01:51,240'),
 ('00:01:51,240', '00:01:53,'),
 ('00:01:53,', '00:01:56,460'),
 ('00:01:56,460', '00:01:58,'),
 ('00:01:58,', '00:02:01,'),
 ('00:02:01,', '00:02:04,'),
 ('00:02:04,', '00:02:07,229'),
 ('00:02:07,229', '00:02:09,380'),
 ('00:02:09,380', '00:02:12,060'),
 ('00:02:12,060', '00:02:14,840')]

You can then print these pairs in the format you want (with the list above stored in timestamps):

for start,end in timestamps:
    print(f"{start} - {end}")
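Note that re.findall returns non-overlapping matches, so the two-group pattern above pairs the 1st timestamp with the 2nd, the 3rd with the 4th, and so on. If your string instead lists every timestamp only once and you want each one paired with its successor, one option is to extract single timestamps and zip the list with a copy of itself shifted by one. A minimal sketch; the helper name `consecutive_pairs` and the sample string are assumptions made for illustration, not part of the original answer:

```python
import re

# One timestamp in the form HH:MM:SS,mmm
TIMESTAMP = re.compile(r"\d{2}:\d{2}:\d{2},\d{3}")

def consecutive_pairs(text):
    """Pair every timestamp found in text with the one that follows it."""
    stamps = TIMESTAMP.findall(text)
    return list(zip(stamps, stamps[1:]))

# Hypothetical sample in which each timestamp appears only once.
sample = "[,00:00:03,950,00:00:06,840,00:00:09,180,]"
for start, end in consecutive_pairs(sample):
    print(f"{start} - {end}")
# 00:00:03,950 - 00:00:06,840
# 00:00:06,840 - 00:00:09,180
```

Which variant is right depends on whether the shared boundary between two pairs is written once or twice in your string.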

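Here is a solution without regular expressions.
Clean up the string and split it on ',' to create a list.
Use list slicing to select the even- and odd-indexed values and zip them together.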

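# data is your timestamp string

# remove the enclosing '[,' and ',]' and split on commas
parts = data.replace('[,', '').replace(',]', '').split(',')

# every timestamp contains a comma itself (HH:MM:SS,mmm),
# so re-join each HH:MM:SS piece with its milliseconds piece
stamps = [f"{hms},{ms}" for hms, ms in zip(parts[::2], parts[1::2])]

# use list slicing and zip to pair each start with its end
combinations = list(zip(stamps[::2], stamps[1::2]))

# print the first 5
print(combinations[:5])
[out]:

[('00:00:03,950', '00:00:06,840'), ('00:00:06,840', '00:00:09,180'), ('00:00:09,180', '00:00:10,830'), ('00:00:10,830', '00:00:14,070'), ('00:00:14,070', '00:00:16,890')]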
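A split-based approach only works if every timestamp contains exactly one comma and the string holds complete start/end pairs; a single timestamp without its milliseconds part would silently shift every pair after it. A minimal sketch that adds a sanity check for this. The function name `pairs_without_regex` and the sample string are hypothetical, and the sample assumes the start and end of every pair are both written out:

```python
def pairs_without_regex(data):
    """Split-based pairing; assumes every timestamp looks like HH:MM:SS,mmm."""
    # Drop the surrounding brackets and commas, then split on commas.
    parts = data.strip("[],").split(",")
    # Each pair needs four pieces: start HH:MM:SS, start ms, end HH:MM:SS, end ms.
    if len(parts) % 4 != 0:
        raise ValueError("expected complete start/end timestamp pairs")
    # Re-join each HH:MM:SS piece with its milliseconds piece.
    stamps = [f"{hms},{ms}" for hms, ms in zip(parts[::2], parts[1::2])]
    # Pair even-indexed timestamps (starts) with odd-indexed ones (ends).
    return list(zip(stamps[::2], stamps[1::2]))

# Hypothetical sample: two pairs, with the shared boundary written twice.
sample = "[,00:00:03,950,00:00:06,840,00:00:06,840,00:00:09,180,]"
for start, end in pairs_without_regex(sample):
    print(f"{start} - {end}")
```

The length check only catches inputs whose piece count is off; it does not validate the format of each piece, which is where a regular expression is the more robust choice.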

extract indexing python-3.x