Problem description
I want to sort the XML below by the "value" attribute of the "entry" tags, with strings (letters) sorted before numbers.
<test>
<entry value="-12" />
<entry value="0" />
<entry value="043" />
<entry value="14" />
<entry value="6" />
<entry value="_null" />
<entry value="abc" />
<entry value="abcd" />
<entry value="empty" />
<entry value="false" />
<entry value="test1" />
<entry value="test2" />
<entry value="true" />
</test>
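I wrote some Python to sort this XML, but it sorts the numbers first and the strings after them. I have already looked at this thread, but could not get any of its XML sorting solutions to work.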
import xml.etree.ElementTree as ElT
import os
from os.path import sep

def sort_xml(directory, xml_file, level1_tag, attribute, mode=0):
    # mode 0 - numbers before letters
    # mode 1 - letters before numbers
    file = directory + sep + xml_file
    tree = ElT.parse(file)
    data = tree.getroot()
    els = data.findall(level1_tag)
    if mode == 0:
        new_els = sorted(els, key=lambda e: (e.tag, e.attrib[attribute]))
    if mode == 1:
        new_els = sorted(els, key=lambda e: (isinstance(e.tag, (float, int)), e.attrib[attribute]))
    # Sort the children of each matched element as well
    for el in new_els:
        if mode == 0:
            el[:] = sorted(el, key=lambda e: e.attrib[attribute])
        if mode == 1:
            el[:] = sorted(el, key=lambda e: e.attrib[attribute])
    data[:] = new_els
    tree.write(file, xml_declaration=True, encoding='utf-8')
    # Drop the first line of the written file (the XML declaration)
    with open(file, 'r') as fin:
        data = fin.read().splitlines(True)
    with open(file, 'w') as fout:
        fout.writelines(data[1:])

sort_xml(os.getcwd(), "test.xml", "entry", "value", 1)
Any idea how to do this?
Edit 1: desired output
<test>
<entry value="_null" />
<entry value="abc" />
<entry value="abcd" />
<entry value="empty" />
<entry value="false" />
<entry value="test1" />
<entry value="test2" />
<entry value="true" />
<entry value="-12" />
<entry value="0" />
<entry value="043" />
<entry value="14" />
<entry value="6" />
</test>
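Solution
I put the entries that start with a letter at the top. Having the letters on top is the actual requirement; I don't care about the rest.
See below: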
import xml.etree.ElementTree as ET

xml = '''<test>
<entry value="-12" />
<entry value="/this" />
<entry value="0" />
<entry value="043" />
<entry value="14" />
<entry value="6" />
<entry value="_null" />
<entry value="abc" />
<entry value="abcd" />
<entry value="empty" />
<entry value="false" />
<entry value="test1" />
<entry value="test2" />
<entry value="true" />
</test>'''

root = ET.fromstring(xml)
numeric = []
non_numeric = []
# Split the values into those that parse as integers and those that do not
for entry in root.findall('.//entry'):
    try:
        x = int(entry.attrib['value'])
        numeric.append((x, entry.attrib['value']))
    except ValueError:
        non_numeric.append(entry.attrib['value'])

# Sort both lists in place: numbers by integer value, the rest as strings
numeric.sort(key=lambda x: x[0])
non_numeric.sort()

# Build a new tree: non-numeric values first, then the numeric ones
root = ET.Element('test')
for value in non_numeric:
    entry = ET.SubElement(root, 'entry')
    entry.attrib['value'] = value
for value in numeric:
    entry = ET.SubElement(root, 'entry')
    entry.attrib['value'] = value[1]
ET.dump(root)
Output (line breaks added for readability)
<test>
<entry value="/this" />
<entry value="_null" />
<entry value="abc" />
<entry value="abcd" />
<entry value="empty" />
<entry value="false" />
<entry value="test1" />
<entry value="test2" />
<entry value="true" />
<entry value="-12" />
<entry value="0" />
<entry value="6" />
<entry value="14" />
<entry value="043" />
</test>
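I think your problem is that, when sorting, you check whether the value is an int or a float. In fact, all of the values are strings, so isinstance(e.tag,(float,int)) will always be False.
A sort key function like this should do what you need: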
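To make the point above concrete, here is a minimal sketch, assuming a two-entry document made up for illustration. It shows that ElementTree hands back both the tag name and the attribute value as str, so the isinstance check yields the same result for every element and cannot separate numbers from letters.

```python
import xml.etree.ElementTree as ET

# Tiny illustrative document, not the full file from the question
root = ET.fromstring('<test><entry value="14" /><entry value="abc" /></test>')

flags = []
for e in root:
    # The tag name and the attribute value are both plain str objects
    print(type(e.tag).__name__, type(e.get("value")).__name__)
    flags.append(isinstance(e.tag, (float, int)))

# The first component of the question's sort key is the same for every entry
print(flags)
```

Because that first key component is constant, the sort falls back entirely on the attribute string, which is why the numbers still come out first.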
def sorter(x):
    "Sort by whether the value can be interpreted as an integer, then by the string"
    value = x.get("value")

    def is_integer(i):
        try:
            int(i)
        except ValueError:
            return False
        return True

    return is_integer(value), value
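The key works because Python compares tuples element by element and False orders before True, so every non-integer value lands ahead of every integer one. A small sketch on plain strings follows; key_for is a hypothetical helper that mirrors sorter but takes a string rather than an element, and the sample values are picked for illustration only.

```python
def key_for(value):
    """Return (is_integer, value); non-integers get False and sort first."""
    try:
        int(value)
    except ValueError:
        return (False, value)
    return (True, value)

values = ["14", "abc", "-12", "_null", "043"]
ordered = sorted(values, key=key_for)
print(ordered)  # ['_null', 'abc', '-12', '043', '14']
```

Within each group the second tuple element is the raw string, so the integers stay in string order ("043" before "14"), which matches the desired output in the question.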
The sorter function can be used like this (with StringIO standing in for a file):
from xml.etree import ElementTree
from io import StringIO
xml = """<test>
<entry value="-12" />
<entry value="0" />
<entry value="043" />
<entry value="14" />
<entry value="6" />
<entry value="_null" />
<entry value="abc" />
<entry value="abcd" />
<entry value="empty" />
<entry value="false" />
<entry value="test1" />
<entry value="test2" />
<entry value="true" />
</test>"""
tree = ElementTree.parse(StringIO(xml))
root = tree.getroot()
root[:] = sorted(root,key=sorter)
tree.write("output.xml")
The contents of output.xml are:
<test>
<entry value="_null" />
<entry value="abc" />
<entry value="abcd" />
<entry value="empty" />
<entry value="false" />
<entry value="test1" />
<entry value="test2" />
<entry value="true" />
<entry value="-12" />
<entry value="0" />
<entry value="043" />
<entry value="14" />
<entry value="6" />
</test>
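If the integers should also come out in numeric order rather than string order, the same idea can be extended. This is only a sketch under assumptions: letters_then_numbers is a hypothetical key function, every entry is assumed to carry a "value" attribute, and the result deliberately differs from the desired output in the question, which keeps "043" before "14".

```python
import xml.etree.ElementTree as ET

def letters_then_numbers(elem):
    """Hypothetical key: non-integers first (as text), then integers by value."""
    value = elem.get("value")
    try:
        # Group 1: integers, ordered numerically, raw string as tie-breaker
        return (1, int(value), value)
    except ValueError:
        # Group 0: everything else, ordered as plain strings
        return (0, 0, value)

root = ET.fromstring(
    '<test><entry value="14" /><entry value="abc" /><entry value="6" />'
    '<entry value="043" /><entry value="_null" /><entry value="-12" /></test>'
)
# Reorder the existing elements in place, as in the answer above
root[:] = sorted(root, key=letters_then_numbers)
result = [e.get("value") for e in root]
print(result)  # ['_null', 'abc', '-12', '6', '14', '043']
```

Sorting the existing elements in place keeps any other attributes or child nodes they have, whereas rebuilding the tree from the values alone would drop them.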