Parsing data from an XML file in Python

Popularity: 19   Posted: 2023-06-16 10:21:34.0

I have an XML file:

<swissprot created="2010-12-20">
 <entrylevel dataset="abc">
    <references id="1">
        <title>first references</title>
        <author>
            <person name="Mr. A"/>
            <person name="Mr. B"/>
            <person name="Mr. C"/>
        </author>
        <score> score1 for id 1 </score>
        <score> score2 for id 1 </score>
        <score> score3 for id 1 </score>
    </references>
    <references id="2">
        <title>Second references</title>
        <author>
            <person name="Mr. D"/>
            <person name="Mr. E"/>
            <person name="Mr. F"/>
        </author>
        <score> score1 for id 2 </score>
        <score> score2 for id 2 </score>
        <score> score3 for id 2 </score>
    </references>
    <references id="3">
        <title>third references</title>
        <author>
            <person name="Mr. G"/>
            <person name="Mr. H"/>
            <person name="Mr. I"/>
        </author>
        <score> score1 for id 3 </score>
        <score> score2 for id 3 </score>
        <score> score3 for id 3 </score>
    </references>
    <references id="4">
        <title>fourth references</title>
        <author>
            <person name="Mr. J"/>
            <person name="Mr. K"/>
            <person name="Mr. L"/>
        </author>
        <score> score 1 for id 4 </score>
        <score> score 2 for id 4 </score>
        <score> score 3 for id 4 </score>
    </references>
  </entrylevel>
</swissprot>  

I want to print every reference from this XML in a specific format. Desired output:

First Reference
Mr A, Mr B, Mr C
score 1 for id 1, score 2 for id 1, score 3 for id 1

Second Reference
Mr D, Mr E, Mr F
score 1 for id 2, score 2 for id 2, score 3 for id 2

Third Reference
Mr G, Mr H, Mr I
score 1 for id 3, score 2 for id 3, score 3 for id 3

Fourth Reference
Mr J, Mr K, Mr L
score 1 for id 4, score 2 for id 4, score 3 for id 4

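I wrote the code below. It gets the title values in the right format, but I cannot get the author information that belongs to each individual entry.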

import xml.etree.ElementTree as ET
document = ET.parse("recipe.xml")
root = document.getroot()
title=[]
author=[]
score=[]  

for i in root.getiterator('title'):
     title.append(i.text)
     for j in root.getiterator('author'):
          author.append(j.text)
           for k in root.getiterator('score'):
                score.append(k.text) 

for i,j,k in zip(title,author,score):
      print i,j,k

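Iterate over the references elements instead of the title elements. That way the authors and scores you collect belong to the entry you are currently visiting.

Adapt the following code as needed!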

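import xml.etree.ElementTree as ET
document = ET.parse("recipe.xml")
root = document.getroot()
TITLE = 0  # <title> is the first child of each <references> element

# Walk each <references> element so authors and scores stay grouped per entry
for child in root.getiterator('references'):
    author=[]
    score=[] 
    # Author names are stored in the name attribute of each <person>
    for k in child.getiterator('person'):
        author.append(k.get('name'))
    # Scores are stored as the text of each <score>
    for l in child.getiterator('score'):
        score.append(l.text)

    print child[TITLE].text
    print ', '.join(author)
    print ', '.join(score)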

Output:

first references
Mr. A, Mr. B, Mr. C
 score1 for id 1 ,  score2 for id 1 ,  score3 for id 1 
Second references
Mr. D, Mr. E, Mr. F
 score1 for id 2 ,  score2 for id 2 ,  score3 for id 2 
third references
Mr. G, Mr. H, Mr. I
 score1 for id 3 ,  score2 for id 3 ,  score3 for id 3 
fourth references
Mr. J, Mr. K, Mr. L
 score 1 for id 4 ,  score 2 for id 4 ,  score 3 for id 4
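
The answer's code is written for Python 2: it uses the print statement and getiterator(), which was removed from ElementTree in Python 3.9. The following is a minimal Python 3 sketch of the same per-references loop, not the original author's code. The helper name format_references and the inline SAMPLE string are assumptions made for illustration, and SAMPLE is a shortened copy of the XML from the question. The sketch also strips the whitespace around each score, which explains the stray spaces in the output above. Note that the person names are stored in the name attribute, so reading .text on author only returns whitespace.

```python
import xml.etree.ElementTree as ET

# Shortened sample with the same structure as the XML in the question
# (assumption: the real file follows this layout).
SAMPLE = """\
<swissprot created="2010-12-20">
  <entrylevel dataset="abc">
    <references id="1">
      <title>first references</title>
      <author>
        <person name="Mr. A"/>
        <person name="Mr. B"/>
      </author>
      <score> score1 for id 1 </score>
      <score> score2 for id 1 </score>
    </references>
    <references id="2">
      <title>Second references</title>
      <author>
        <person name="Mr. D"/>
      </author>
      <score> score1 for id 2 </score>
    </references>
  </entrylevel>
</swissprot>
"""


def format_references(root):
    """Return one (title, authors, scores) tuple of strings per <references>.

    Hypothetical helper, introduced here only for illustration.
    """
    rows = []
    for ref in root.iter("references"):
        # findtext returns the text of the first matching child element
        title = ref.findtext("title", default="").strip()
        # Author names live in the name attribute, not in element text
        authors = [p.get("name", "") for p in ref.iter("person")]
        # Strip the padding whitespace around each score
        scores = [(s.text or "").strip() for s in ref.iter("score")]
        rows.append((title, ", ".join(authors), ", ".join(scores)))
    return rows


# For a file on disk, use: root = ET.parse("recipe.xml").getroot()
root = ET.fromstring(SAMPLE)
rows = format_references(root)
for title, authors, scores in rows:
    print(title)
    print(authors)
    print(scores)
    print()
```

Returning the rows as tuples before printing keeps the parsing separate from the formatting, so the layout can be changed without touching the traversal.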
