Shell script to parse a CSV file into an XML query?

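I have a list of references in a CSV file, and I want to use it to fill in CrossRef's XML-based query form.

CrossRef provides an XML template (below, with unused fields removed). I would like to parse the columns of the CSV file to populate the fields inside the query element, repeating the element once per row.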

<?xml version = "1.0" encoding="UTF-8"?>
<query_batch xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0" xmlns="http://www.crossref.org/qschema/2.0"
  xsi:schemaLocation="http://www.crossref.org/qschema/2.0 http://www.crossref.org/qschema/crossref_query_input2.0.xsd">
<head>
   <email_address>[email protected]</email_address>
   <doi_batch_id>test</doi_batch_id>
</head>
<body>
  <query enable-multiple-hits="true"
            list-components="false"
            expanded-results="false" key="key">
    <article_title match="fuzzy"></article_title>
    <author search-all-authors="false"></author>
    <volume></volume>
    <year></year>
    <first_page></first_page>
    <journal_title></journal_title>
  </query>
</body>
</query_batch>

How can this be done in a shell script?

Sample input:

author,year,article_title,journal_title,volume,first_page
Adler,2006,"Biomass yield and biofuel quality of switchgrass harvested in fall or spring","Agronomy Journal",98,1518
Alexopolou,2008,"Biomass yields for upland and lowland switchgrass varieties grown in the Mediterranean region","Biomass and Bioenergy",32,926
Balasko,1984,"Yield and Quality of Switchgrass Grown without Soil Amendments.","Agronomy Journal",76,204

Desired output:

<?xml version = "1.0" encoding="UTF-8"?>
<query_batch xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0" xmlns="http://www.crossref.org/qschema/2.0"
  xsi:schemaLocation="http://www.crossref.org/qschema/2.0 http://www.crossref.org/qschema/crossref_query_input2.0.xsd">
<head>
   <email_address>[email protected]</email_address>
   <doi_batch_id>test</doi_batch_id>
</head>
<body>
 <query>
  <author>Adler</author>
  <year>2006</year>
  <article_title>Biomass yield and biofuel quality of switchgrass harvested in fall or spring</article_title>
  <journal_title>Agronomy Journal</journal_title>
  <volume>98</volume>
  <first_page>1518</first_page>
 </query>
 <query>
  <author>Alexopolou</author>
  <year>2008</year>
  <article_title>Biomass yields for upland and lowland switchgrass varieties grown in the Mediterranean region</article_title>
  <journal_title>Biomass and Bioenergy</journal_title>
  <volume>32</volume>
  <first_page>926</first_page>
 </query>
 <query>
  <author>Balasko</author>
  <year>1984</year>
  <article_title>Yield and Quality of Switchgrass Grown without Soil Amendments.</article_title>
  <journal_title>Agronomy Journal</journal_title>
  <volume>76</volume>
  <first_page>204</first_page>
 </query>
</body>
</query_batch>

Other questions offer some help for C# and Java.

#!/usr/bin/awk -f
# XML attribute values must always be quoted; either single or double quotes work.
# Single quotes are used here so they need no escaping inside awk strings.

BEGIN{
    FS=","
    print "<?xml version = '1.0' encoding='UTF-8'?>"
    print "<query_batch xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance' version='2.0' xmlns='http://www.crossref.org/qschema/2.0'"
    print "  xsi:schemaLocation='http://www.crossref.org/qschema/2.0 http://www.crossref.org/qschema/crossref_query_input2.0.xsd'>"
    print "<head>"
    print "   <email_address>[email protected]</email_address>"
    print "   <doi_batch_id>test</doi_batch_id>"
    print "</head>"
    print "<body>"
}

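NR>1{
    # Strip the double quotes that surround quoted CSV fields.
    # Note: FS="," still cannot cope with commas inside a quoted field.
    for (i = 1; i <= NF; i++) gsub(/^"|"$/, "", $i)

    # One <query> element per data row (the header row is skipped by NR>1).
    print "  <query enable-multiple-hits='true'"
    print "            list-components='false'"
    print "            expanded-results='false' key='key'>"
    print "    <article_title match='fuzzy'>" $3 "</article_title>"
    print "    <author search-all-authors='false'>" $1 "</author>"
    print "    <volume>" $5 "</volume>"
    print "    <year>" $2 "</year>"
    print "    <first_page>" $6 "</first_page>"
    print "    <journal_title>" $4 "</journal_title>"
    print "  </query>"
}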

END{
    print "</body>"
    print "</query_batch>"
}

Run it with:

$ awk -f script.awk input.csv
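
Splitting on FS="," breaks as soon as a quoted field contains a comma, and a title containing & or < would produce invalid XML. Below is a minimal sketch of one way to handle both in portable awk. It assumes the same column order as the sample input and emits the plain elements from the desired output. The names csv_to_queries, split_csv and xml_escape are hypothetical helpers, not part of any tool.

```shell
# Hypothetical helper: reads CSV on stdin, writes one <query> per data row.
csv_to_queries() {
    awk '
    # Escape the characters that are special in XML text content.
    # "&" is handled first so the other entities are not escaped twice.
    function xml_escape(s) {
        gsub(/&/, "\\&amp;", s)
        gsub(/</, "\\&lt;", s)
        gsub(/>/, "\\&gt;", s)
        return s
    }

    # Split one CSV line into out[1..n] and return n. Commas inside
    # double-quoted fields are kept, and "" inside quotes becomes ".
    function split_csv(line, out,    n, i, c, inq, field) {
        split("", out)                  # clear values left from the previous row
        n = 0; field = ""; inq = 0
        for (i = 1; i <= length(line); i++) {
            c = substr(line, i, 1)
            if (inq) {
                if (c != "\"") field = field c
                else if (substr(line, i + 1, 1) == "\"") { field = field c; i++ }
                else inq = 0
            } else if (c == "\"") inq = 1
            else if (c == ",") { out[++n] = field; field = "" }
            else field = field c
        }
        out[++n] = field
        return n
    }

    NR > 1 && length($0) > 0 {          # skip the header row and blank lines
        split_csv($0, f)
        print " <query>"
        print "  <author>" xml_escape(f[1]) "</author>"
        print "  <year>" xml_escape(f[2]) "</year>"
        print "  <article_title>" xml_escape(f[3]) "</article_title>"
        print "  <journal_title>" xml_escape(f[4]) "</journal_title>"
        print "  <volume>" xml_escape(f[5]) "</volume>"
        print "  <first_page>" xml_escape(f[6]) "</first_page>"
        print " </query>"
    }
    '
}
```

Calling csv_to_queries < input.csv prints only the <query> elements, so the XML declaration, <head> and the closing tags still have to be printed around it, for example as the BEGIN and END blocks above do. Quoted fields that span several lines are not handled by this sketch.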
