Importing XML into Orange3 with a Python script

Problem description

I have an XML document on my computer that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<IPDatas xmlns:xsi="http://www.w3.org/...">
   <datas>
      <dna>
         <profile>
            <loci>
               <locus name="one">
                  <allele order="1">10</allele>
                  <allele order="2">12.3</allele>
               </locus>
               <locus name="two">
                  <allele order="1">11.1</allele>
                  <allele order="2">17</allele>
               </locus>
               <locus name="three">
                  <allele order="1">13.2</allele>
                  <allele order="2">12.3</allele>
               </locus>
            </loci>
         </profile>
      </dna>
   </datas>
</IPDatas>

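I want to import the document into Orange without first converting it outside of Orange, so I probably need to use the "Python Script" widget. After importing, I want to turn it into a table like this: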

one_1 one_2 two_1 two_2 three_1 three_2
10 12.3 11.1 17 13.2 12.3

My knowledge of Python is poor, so any suggestions would be much appreciated!

Solution

Something like the following:

import xml.etree.ElementTree as ET
import pprint


xml = '''
<IPDatas xmlns:xsi="http://www.w3.org/...">
   <datas>
      <dna>
         <profile>
            <loci>
               <locus name="one">
                  <allele order="1">10</allele>
                  <allele order="2">12.3</allele>
               </locus>
               <locus name="two">
                  <allele order="1">11.1</allele>
                  <allele order="2">17</allele>
               </locus>
               <locus name="three">
                  <allele order="1">13.2</allele>
                  <allele order="2">12.3</allele>
               </locus>
            </loci>
         </profile>
      </dna>
   </datas>
</IPDatas> '''

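# Maps column names such as "one_1" to the numeric allele value.
data = {}
root = ET.fromstring(xml)
# './/locus' finds every <locus> element at any depth below the root.
locus_lst = root.findall('.//locus')
for locus in locus_lst:
    name = locus.attrib['name']
    # 'allele' matches only the direct <allele> children of this locus.
    allele_lst = locus.findall('allele')
    for allele in allele_lst:
        # Combine the locus name and the allele order, e.g. "one_1".
        final_name = f"{name}_{allele.attrib['order']}"
        value = float(allele.text)
        data[final_name] = value
# pprint sorts dictionary keys alphabetically by default.
pprint.pprint(data)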

Output (a dictionary that you should be able to use with Orange)

{'one_1': 10.0,
 'one_2': 12.3,
 'three_1': 13.2,
 'three_2': 12.3,
 'two_1': 11.1,
 'two_2': 17.0}
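
The dictionary still has to be turned into a single-row table before other Orange widgets can consume it. The sketch below is one possible way to do that inside the "Python Script" widget, not a definitive implementation. It assumes that Orange3 is installed, that the widget picks up its result from the variable out_data, and that every allele value is numeric. The helper name xml_to_row and the shortened sample string are illustrative and not part of the original answer. The columns are kept in document order so that they match the table requested in the question.

```python
import xml.etree.ElementTree as ET


def xml_to_row(xml_text):
    """Return (column_names, values) for all alleles, in document order."""
    root = ET.fromstring(xml_text)
    names, values = [], []
    for locus in root.findall('.//locus'):
        for allele in locus.findall('allele'):
            names.append(f"{locus.attrib['name']}_{allele.attrib['order']}")
            values.append(float(allele.text))
    return names, values


# Shortened stand-in for the document from the question (illustrative only).
sample = '''
<IPDatas>
   <datas><dna><profile><loci>
      <locus name="one">
         <allele order="1">10</allele>
         <allele order="2">12.3</allele>
      </locus>
      <locus name="two">
         <allele order="1">11.1</allele>
         <allele order="2">17</allele>
      </locus>
      <locus name="three">
         <allele order="1">13.2</allele>
         <allele order="2">12.3</allele>
      </locus>
   </loci></profile></dna></datas>
</IPDatas>'''

names, values = xml_to_row(sample)

try:
    # Available only where Orange3 is installed (e.g. inside the widget).
    from Orange.data import ContinuousVariable, Domain, Table
except ImportError:
    out_data = None  # Orange is not available; only names/values are built.
else:
    # One continuous column per allele, and a table with a single row.
    domain = Domain([ContinuousVariable(n) for n in names])
    out_data = Table.from_list(domain, [values])
```

To read the real file instead of the sample string, the parsing step could use ET.parse(path).getroot() with the path to the document on your computer.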
