Delete all but the first line containing a given string

Problem description

I have a log file named duration.log with the following contents:

2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:48.756914: --- DURATION: 0:00:03.000727 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:51.948027: --- DURATION: 0:00:03.068122 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:55.075158: --- DURATION: 0:00:02.987064 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:25:58.274715: --- DURATION: 0:00:03.063948 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:26:01.753367: --- DURATION: 0:00:03.273167 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:26:05.001949: --- DURATION: 0:00:03.050073 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-15 20:26:08.206065: --- DURATION: 0:00:03.073367 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 --- 

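I removed the duplicate lines with awk '!seen[$0]++' duration.log, as described in "How to delete duplicate lines in a file without sorting it in Unix?".

Now, how do I delete every line containing the string 210415-821-PK-JDoe except the first one? awk or any other bash tool is fine.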

Update:

I am looking for the following output:

2021-04-15 20:25:45.639181: --- DURATION: 0:00:02.928309 --- ROUTE NAME: 210415-821-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:03:24.188722: --- DURATION: 0:00:21.995238 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:03:50.434883: --- DURATION: 0:00:26.140902 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:04:18.552286: --- DURATION: 0:00:27.793468 --- ROUTE NAME: None --- HEADLESS: 0 --- 
2021-04-16 09:06:27.015632: --- DURATION: 0:00:33.688867 --- ROUTE NAME: 210416-829-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:10.487733: --- DURATION: 0:00:42.421573 --- ROUTE NAME: 210416-830-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:07:39.247244: --- DURATION: 0:00:28.391001 --- ROUTE NAME: 210416-831-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:06.292683: --- DURATION: 0:00:26.790946 --- ROUTE NAME: 210416-832-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:29.929427: --- DURATION: 0:00:19.462734 --- ROUTE NAME: 210416-833-PK-JDoe --- HEADLESS: 0 --- 
2021-04-16 09:08:53.306396: --- DURATION: 0:00:23.140965 --- ROUTE NAME: 210416-834-PK-JDoe --- HEADLESS: 0 --- 

Solution

Based on the samples you have shown, could you please try the following.

awk '/210415-821-PK-JDoe/ && ++count>1{next} 1'  Input_file

Or, as per Sundeep's comment above, it can be written as:

awk '!/210415-821-PK-JDoe/ || !count++'  Input_file
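The shorter form relies on short-circuit evaluation: a line is printed if it does not contain the string, or if it does and count is still 0. Here is a sketch that spells the same logic out as two separate rules. The function name first_only and the three sample lines are made up for illustration and are not part of the original answer.

```shell
# first_only is a hypothetical helper name; it reads lines from standard input.
first_only() {
  awk '
    !/210415-821-PK-JDoe/ { print; next }  # line lacks the string: always keep it
    !count++              { print }        # line has the string: keep it only while count is 0
  '
}

# Illustrative input: the second matching line should be dropped.
printf '%s\n' 'first 210415-821-PK-JDoe' 'other route' 'second 210415-821-PK-JDoe' | first_only
```

In the second rule, count++ returns the old value and then increments it, so !count++ is true only for the first matching line.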

Explanation: a detailed, commented version of the first command.

awk '                                ##Starting awk program from here.
/210415-821-PK-JDoe/ && ++count>1{   ##checking condition if line contains 210415-821-PK-JDoe AND count is greater than 1 then do following.
  next                               ##next will skip all further statements from here.
}
1                                    ##1 will print current line here.
'  Input_file                        ##Mentioning Input_file name here.
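If the string to match changes from run to run, it may be convenient to pass it in as a variable instead of hardcoding it in the pattern. The following is only a sketch under a few assumptions: the function name keep_first_match is hypothetical, and it uses index() for a literal substring test rather than the regex match used above, so characters such as . or * in the string are not treated specially.

```shell
# keep_first_match STRING FILE
# Hypothetical helper (not from the original answer): prints every line of FILE,
# but only the first of the lines that contain STRING.
keep_first_match() {
  # -v s=... passes the string into awk; index($0, s) is nonzero when s occurs in the line.
  awk -v s="$1" '!index($0, s) || !count++' "$2"
}
```

It could be used as keep_first_match 210415-821-PK-JDoe duration.log > duration.tmp && mv duration.tmp duration.log, where duration.tmp is just an example name for a temporary file. Note that awk interprets backslash escapes in values passed with -v, so a string containing backslashes would need extra care.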