Problem description
I am trying to go from this (the asterisks are only there to mark which lines I want to keep)
*2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:20:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:30:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
*2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:50:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
*2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
to
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
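Within each run of lines that are identical apart from the date, I want to drop the duplicates and keep only the first line, but I don't know how to do it. I tried sorting from the second field onward, but that does not take the "groups" of lines into account: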
cat <my file> | sort -t";" -k2 -u
which gives me this:
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
Does anyone have an idea?
Solution
Given the -m flag, sort assumes its input is already sorted and does not sort it again; that is exactly what you are looking for here.
$ sort -m -t';' -k2 -u <file
2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
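To see why -m matters, the sketch below runs both invocations on a made-up four-line sample (the file name sample.txt and its contents are illustrative, not taken from the question). It relies on how GNU sort treats -m when the input is not actually sorted on the key; other sort implementations may behave differently.

```shell
# Hypothetical sample: timestamps t1..t4 with payloads a, a, b, a.
printf '%s\n' 't1;a' 't2;a' 't3;b' 't4;a' > sample.txt

# Plain -u sorts first, so every "a" line becomes adjacent and collapses to one.
global=$(sort -t';' -k2 -u sample.txt)

# -m skips the sort, so only consecutive lines with equal keys collapse.
consecutive=$(sort -m -t';' -k2 -u sample.txt)

printf 'without -m:\n%s\n\nwith -m:\n%s\n' "$global" "$consecutive"
```

Without -m the second "a" group (t4) is lost; with -m it survives because it is not adjacent to the first one.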
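This is not really a sorting problem. You want to print each line that differs from the previous one, ignoring the date-time column. You can try this awk: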
awk -F ';' '{s=$0; sub(/^[^;]+;/,"",s)} p != s; {p=s}' file
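2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm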
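The one-liner is dense, so here is the same idea spelled out with comments and wrapped in a shell function. The function name dedupe_consecutive and the sample file are illustrative assumptions; the explicit NR == 1 check is a small addition so the first line is always printed.

```shell
# Print a line only when it differs from the previous line once the
# first ';'-separated field (the timestamp) has been stripped.
dedupe_consecutive() {
  awk '
    {
      key = $0
      sub(/^[^;]*;/, "", key)   # drop everything up to the first ";"
    }
    NR == 1 || key != prev      # pattern with no action: print the line
    { prev = key }              # remember the key for the next line
  ' "$@"
}

# Hypothetical sample: timestamps t1..t4 with payloads a, a, b, a.
printf '%s\n' 't1;a' 't2;a' 't3;b' 't4;a' > sample.txt
dedupe_consecutive sample.txt
```

Because the comparison is only against the immediately preceding line, a payload that reappears later (t4 here) is printed again.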
Tell uniq to skip the date field (-s 20 skips the first 20 characters of each line before comparing):
$ uniq -s 20 file
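2020-12-15 19:10:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 19:40:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);100.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm
2020-12-15 20:00:00;34min 8s (min: 32s; max 34min 8s);normal;4;2;91EECB0E;1us (-24);78.688ms (max: 5s);-50.923ms;103.234ms;25.637ms;70;-5,050ppm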
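The hardcoded 20 matches the width of "2020-12-15 19:10:00;", so it only holds while every line starts with a timestamp of exactly that length. A minimal sketch that derives the width from the first line instead; the function name uniq_skip_first_field and the sample data are hypothetical, and it still assumes all lines share the same first-field width.

```shell
# Skip the first ';'-separated field (plus the ';') when comparing lines.
uniq_skip_first_field() {
  file=$1
  IFS= read -r first < "$file"   # first line of the file
  stamp=${first%%;*}             # text before the first ";"
  uniq -s "$(( ${#stamp} + 1 ))" "$file"
}

# Hypothetical sample with payloads a, a, b, a.
printf '%s\n' \
  '2020-12-15 19:10:00;a' \
  '2020-12-15 19:20:00;a' \
  '2020-12-15 19:30:00;b' \
  '2020-12-15 19:40:00;a' > sample.txt
uniq_skip_first_field sample.txt
```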