Problem description
file1 is:
Salmonella_enterica_subsp_enterica_Infantis lcl|CP052796.1_prot_QJV25804.1_4153
...
and file2 is:
...
Prot lcl|CP052796.1_prot_QJV25804.1_4153 98.701 100
...
The output I want is:
Prot Salmonella_enterica_subsp_enterica_Infantis lcl|CP052796.1_prot_QJV25804.1_4153 98.701 100
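I tried join -1 2 -2 2 -o 2.1 1.1 2.2 2.3 2.4 file1 file2
but join prints the warning "join: file2: is not sorted".
I tried sorting both files beforehand, for example with sort -k2,2 file1, but that did not help. Any ideas on how to sort this kind of string?
Thanks!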
Solution
If awk is an option for you, you could try the following, which does not need sorted input:
awk '
    # First block: runs only while reading file1 (NR == FNR)
    NR == FNR {
        f1[$2] = $1    # map the 2nd field (the ID) to the 1st field (the name)
        next
    }
    # Second block: runs for file2; "in" tests the key without creating an entry
    $2 in f1 {         # the 2nd field was also seen in file1
        print $1, f1[$2], $2, $3, $4
    }
' file1 file2
Output with the provided sample:
Prot Salmonella_enterica_subsp_enterica_Infantis lcl|CP052796.1_prot_QJV25804.1_4153 98.701 100
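If you would rather stay with join, here is a minimal sketch. It assumes that the warning comes from sort and join disagreeing on collation order (common with characters such as | and _ under non-C locales), or from file2 not having been sorted on its second field; I cannot confirm which applies to your data. The temporary directory and the .sorted file names are made up for this illustration, and the two input lines are the sample lines from the question.

```shell
#!/bin/sh
# Recreate the sample lines from the question in a temporary directory
# (tmpdir and the *.sorted names are hypothetical, for illustration only).
tmpdir=$(mktemp -d)
printf '%s\n' 'Salmonella_enterica_subsp_enterica_Infantis lcl|CP052796.1_prot_QJV25804.1_4153' > "$tmpdir/file1"
printf '%s\n' 'Prot lcl|CP052796.1_prot_QJV25804.1_4153 98.701 100' > "$tmpdir/file2"

# Sort BOTH files on field 2, forcing the C locale so sort uses plain byte order.
LC_ALL=C sort -k2,2 "$tmpdir/file1" > "$tmpdir/file1.sorted"
LC_ALL=C sort -k2,2 "$tmpdir/file2" > "$tmpdir/file2.sorted"

# Run join under the same locale; the -o list is given comma-separated.
result=$(LC_ALL=C join -1 2 -2 2 -o 2.1,1.1,2.2,2.3,2.4 \
    "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The point is that both inputs are sorted on the join field and that sort and join run under the same locale. Unlike the awk version, this needs the intermediate sorted files, but it does not have to hold file1 in memory.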