Unable to read all of the abstract text from a PubMed XML file

Problem description

I downloaded a PubMed XML file and want to print the abstracts of all the articles in it. This is my code:

import xml.etree.ElementTree as ET
tree = ET.parse('test1.xml')
root = tree.getroot()
for abs_1 in root.findall("PubmedArticle/MedlineCitation/Article/Abstract"):
    abs_2 = abs_1.find('AbstractText').text
    print(abs_2)

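However, I only get the objective part of the abstract, the one tagged <AbstractText Label="aim" NlmCategory="OBJECTIVE">. I don't get the other two parts, which are also inside <Abstract>.

For example, the XML looks something like this: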

<Abstract>
<AbstractText Label="aim" NlmCategory="OBJECTIVE">The level of preparedness of the healthcare system plays an important role in management of coronavirus disease 2019 (COVID-19). This study attempted to devise a comprehensive protocol regarding dental care during the COVID-19 outbreak.</AbstractText>
<AbstractText Label="METHODS AND RESULT" NlmCategory="RESULTS">Embase, PubMed, and Google Scholar were searched until march 2020 for relevant papers. Sixteen English papers were enrolled to answer questions about procedures that are allowed to perform during the COVID-19 outbreak, patients who are in priority to receive dental care services, the conditions and necessities for patient admission, waiting room and operatory room, and personal protective equipment (PPE) that is necessary for dental clinicians and the office staff.</AbstractText>
<AbstractText Label="CONCLUSION" NlmCategory="CONCLUSIONS">Dental treatment should be limited to patients with urgent or emergency situation. By screening questionnaires for COVID-19, patients are divided into three groups of (a) apparently healthy, (b) suspected for COVID-19, and (c) confirmed for COVID-19. Separate waiting and operating rooms should be assigned to each group of patients to minimize the risk of disease transmission. All groups should be treated with the same protective measures with regard to PPE for the dental clinicians and staff.</AbstractText>
<CopyrightInformation>© 2020 Special Care Dentistry Association and Wiley Periodicals, Inc.</CopyrightInformation>
</Abstract>

With my code I only get:

The level of preparedness of the healthcare system plays an important role in management of coronavirus disease 2019 (COVID-19). This study attempted to devise a comprehensive protocol regarding dental care during the COVID-19 outbreak.

I would really appreciate some help with how to print every AbstractText element in the abstract.

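Solution

If you can .findall() the <Abstract> elements, wouldn't it be logical that you can .findall() the <AbstractText> elements in the same way? .find() returns only the first matching child, which is why only the OBJECTIVE section was printed.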

import xml.etree.ElementTree as ET

tree = ET.parse('test1.xml')
root = tree.getroot()

for AbstractText in root.findall("PubmedArticle/MedlineCitation/Article/Abstract/AbstractText"):
    print(AbstractText.text)
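
If you also want to keep the sections grouped per article and show their labels, the same idea can be nested: .findall() the <Abstract> elements, then .findall() the <AbstractText> children of each one. The following is only a sketch under a few assumptions: the inline SAMPLE string is a made-up stand-in for test1.xml, and the helper name abstract_sections is hypothetical. It uses itertext() instead of .text because .text stops at the first nested tag (such as <i> or <sup>), which may appear inside abstract text.

```python
import xml.etree.ElementTree as ET

# Made-up sample standing in for test1.xml (assumption, not the real file).
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <Article>
        <Abstract>
          <AbstractText Label="AIM" NlmCategory="OBJECTIVE">First part.</AbstractText>
          <AbstractText Label="METHODS" NlmCategory="METHODS">Second <i>part</i> here.</AbstractText>
          <AbstractText Label="CONCLUSION" NlmCategory="CONCLUSIONS">Third part.</AbstractText>
        </Abstract>
      </Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def abstract_sections(root):
    """Return one list per <Abstract>, each holding (label, text) tuples."""
    abstracts = []
    for abstract in root.findall("PubmedArticle/MedlineCitation/Article/Abstract"):
        sections = []
        # findall() returns every <AbstractText> child, not just the first.
        for part in abstract.findall("AbstractText"):
            label = part.get("Label")  # None if the attribute is absent
            # itertext() also collects text that sits inside nested tags.
            text = "".join(part.itertext()).strip()
            sections.append((label, text))
        abstracts.append(sections)
    return abstracts


root = ET.fromstring(SAMPLE)  # for a file: ET.parse('test1.xml').getroot()
for sections in abstract_sections(root):
    for label, text in sections:
        print(f"{label}: {text}" if label else text)
    print()  # blank line between articles
```

Returning (label, text) tuples rather than printing directly keeps the parsing separate from the output, so the same helper could feed a CSV writer or a DataFrame later.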
