Problem description
I have a log file named duration.log
whose contents look like this:
2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:48.756914: --- DURATION: 0:00:03.000727 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:51.948027: --- DURATION: 0:00:03.068122 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:55.075158: --- DURATION: 0:00:02.987064 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:25:58.274715: --- DURATION: 0:00:03.063948 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:26:01.753367: --- DURATION: 0:00:03.273167 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:26:05.001949: --- DURATION: 0:00:03.050073 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-15 20:26:08.206065: --- DURATION: 0:00:03.073367 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 ---
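I removed the duplicate lines with awk '!seen[$0]++' duration.log
as described in
How to delete duplicate lines in a file without sorting it in Unix?
Now, how do I delete every line containing the string 210415-821-PK-JDoe
except the first one, while keeping all other lines? awk or any other bash tool is fine.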
Update:
I am looking for the following output:
2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 ---
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 ---
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 ---
Solution
Based on the samples you have shown, please try the following.
awk '/210415-821-PK-JDoe/ && ++count>1{next} 1' Input_file
Or, following Sundeep's comment above, it can be written more concisely as:
awk '!/210415-821-PK-JDoe/ || !count++' Input_file
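The shorter form prints a line when either condition holds: the line does not contain the string, or it does and count is still 0 (the post-increment makes !count++ true only on the first match). A minimal sketch to try it without touching the real log follows; the function name keep_first_match and the three sample lines are made up for illustration and are not part of the original answer.

```shell
# Hypothetical wrapper around the one-liner above; reads the named files,
# or standard input when no file is given.
keep_first_match() {
  # Non-matching lines: the left side of || is true, so they are printed.
  # Matching lines: !count++ is true only while count is 0 (the first match).
  awk '!/210415-821-PK-JDoe/ || !count++' "$@"
}

# Made-up sample input standing in for duration.log.
printf '%s\n' \
  'a ROUTE NAME: 210415-821-PK-JDoe' \
  'b ROUTE NAME: 210415-821-PK-JDoe' \
  'c ROUTE NAME: None' | keep_first_match
```

With this sample, only the lines starting with a and c are printed; the second matching line is dropped.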
Explanation: a detailed breakdown of the first command.
awk ' ##Start of the awk program.
/210415-821-PK-JDoe/ && ++count>1{ ##If the line contains 210415-821-PK-JDoe, increment count; if count is now greater than 1, this is a repeated match.
next ##Skip this line and move on to the next input line.
}
1 ##1 is an always-true pattern, so every line that reaches this point is printed.
' Input_file ##Name of the input file.
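If the string to match changes from run to run, the same logic can be parameterized. The sketch below is one possible variation, not part of the original answer: the function name keep_first_containing and the variable s are hypothetical, and it swaps the regex match for awk's index() so the string is compared literally (a dot or other regex metacharacter in the string would not act as a wildcard). It assumes the string contains no backslashes, since awk interprets escape sequences in -v assignments.

```shell
# Hypothetical helper: keep only the first line containing the given literal
# string, and pass every other line through unchanged.
# Usage: keep_first_containing STRING [FILE...]
keep_first_containing() {
  needle=$1
  shift
  # index($0, s) is non-zero when s occurs anywhere in the line.
  # The second and later matching lines are skipped by next.
  awk -v s="$needle" 'index($0, s) && ++count > 1 { next } 1' "$@"
}
```

It could be called as keep_first_containing 210415-821-PK-JDoe duration.log, or fed from the deduplication step in the question: awk '!seen[$0]++' duration.log | keep_first_containing 210415-821-PK-JDoe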