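Problem description
I am new to Python and want to write Python code that extracts some data from an original XML file and writes it to a new file. My original XML file looks like this.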
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Header/>
<soapenv:Body>
<SessionID xmlns="http://www.niku.com/xog">12345</SessionID>
<QueryResult xmlns="http://www.niku.com/xog/Query" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Records>
<Record>
<id>1</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Payne,Max</name>
</Record>
<Record>
<id>2</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Reno,Jean</name>
</Record>
</Records>
</QueryResult>
</soapenv:Body>
</soapenv:Envelope>
I want the new file to contain only the <Records> block, like this.
<Records>
<Record>
<id>1</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Payne,Max</name>
</Record>
<Record>
<id>2</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Reno,Jean</name>
</Record>
</Records>
With this code I was able to get the result below.
import xml.etree.ElementTree as ET
tree = ET.parse('my_file.xml')
root = tree.getroot()
for xtag in root.findall('.//{http://www.niku.com/xog/Query}Record'):
    print(xtag)
Result:
<Element '{http://www.niku.com/xog/Query}Record' at 0x00000216BA69B778>
<Element '{http://www.niku.com/xog/Query}Record' at 0x00000216BA6A3228>
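Can anyone help me achieve what I need?
Solution
In your case, print(xtag) prints the xtag Element object itself rather than its XML text. To get the text, convert the element to a string with the ElementTree module's tostring() function; it returns bytes by default, so the result is decoded before printing. Also, you seem to want the whole <Records> block rather than the individual <Record> elements; for that you do not need a loop.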
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
records = root.find('.//{http://www.niku.com/xog/Query}Records')
print(ET.tostring(records).decode("utf-8"))
Output
<ns0:Records xmlns:ns0="http://www.niku.com/xog/Query">
<ns0:Record>
<ns0:id>1</ns0:id>
<ns0:date_start>2020-10-04T00:00:00</ns0:date_start>
<ns0:date_end>2020-10-10T00:00:00</ns0:date_end>
<ns0:name>Payne,Max</ns0:name>
</ns0:Record>
<ns0:Record>
<ns0:id>2</ns0:id>
<ns0:date_start>2020-10-04T00:00:00</ns0:date_start>
<ns0:date_end>2020-10-10T00:00:00</ns0:date_end>
<ns0:name>Reno,Jean</ns0:name>
</ns0:Record>
</ns0:Records>
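The output above still carries the ns0 prefix and is only printed, while the question asks for a plain <Records> block written to a new file. One possible way to get there with the standard library is sketched below. It rewrites each tag to its local name before serializing. The names QUERY_NS, extract_records and write_records are made up for this sketch, and it assumes that no attributes use a namespace (true for the sample file).

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the question's sample file.
QUERY_NS = "http://www.niku.com/xog/Query"


def extract_records(root):
    """Return the <Records> element with namespaces removed from its tags.

    Hypothetical helper for illustration; it modifies the tree in place.
    """
    records = root.find(f".//{{{QUERY_NS}}}Records")
    if records is None:
        raise ValueError("no Records element found")
    for elem in records.iter():
        # ElementTree stores tags as '{namespace}name'; keep only 'name'.
        if isinstance(elem.tag, str) and "}" in elem.tag:
            elem.tag = elem.tag.split("}", 1)[1]
    return records


def write_records(src_path, dst_path):
    """Parse src_path and write only the <Records> block to dst_path."""
    root = ET.parse(src_path).getroot()
    ET.ElementTree(extract_records(root)).write(dst_path, encoding="utf-8")
```

Calling write_records('my_file.xml', 'records.xml') would then write the block to records.xml, where the output file name is only an example. Dropping the namespace changes the meaning of the XML for namespace-aware consumers, so this is only appropriate when the new file is meant to be namespace-free, as in the question.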
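You can also use the lxml module, whose output differs slightly: it keeps the original default namespace instead of introducing an ns0 prefix, and it also carries over the namespace declarations that are in scope from the enclosing elements.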
from lxml import etree
tree = etree.parse('test.xml')
root = tree.getroot()
records = root.find('.//{http://www.niku.com/xog/Query}Records')
print(etree.tostring(records).decode("utf-8"))
Output
<Records xmlns="http://www.niku.com/xog/Query" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<Record>
<id>1</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Payne,Max</name>
</Record>
<Record>
<id>2</id>
<date_start>2020-10-04T00:00:00</date_start>
<date_end>2020-10-10T00:00:00</date_end>
<name>Reno,Jean</name>
</Record>
</Records>
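If the goal is to pull individual field values out of each <Record> rather than copy the whole block, the same namespace URI can be supplied through a prefix mapping. The following is a minimal standard-library sketch; the prefix "q" and the function name record_rows are arbitrary choices made for this example.

```python
import xml.etree.ElementTree as ET

# "q" is an arbitrary prefix chosen for this sketch; only the URI matters.
NS = {"q": "http://www.niku.com/xog/Query"}


def record_rows(xml_text):
    """Return one dict per <Record>, mapping local tag names to text."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iterfind(".//q:Record", NS):
        # Strip the '{namespace}' part so the keys are plain names like 'id'.
        rows.append({child.tag.split("}", 1)[-1]: child.text for child in record})
    return rows
```

Each returned dict can then be written out in whatever format the new file needs.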