Python XML to JSON: How to Use XML2Dict

Updated: 2020-06-21 00:12:01  Author: startmvc
1. Reading and writing JSON


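import json


def parseFromFile(self, fname):
    """
    Overwritten to read JSON files.
    """
    # Use a context manager so the file is closed after reading.
    with open(fname, "r") as f:
        return json.load(f)


def serializeToFile(self, fname, annotations):
    """
    Overwritten to write JSON files.
    """
    # Use a context manager so the data is flushed and the file is closed.
    with open(fname, "w") as f:
        json.dump(annotations, f, indent=4, separators=(',', ': '), sort_keys=True)
        f.write("\n")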
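
As a quick check of the two helpers above, the sketch below writes a small annotation list to a temporary file and reads it back. The standalone names `write_json` and `read_json` and the sample data are assumptions made for illustration; they mirror the methods above without the `self` argument.

```python
import json
import os
import shutil
import tempfile


def write_json(fname, annotations):
    """Write annotations as indented JSON with sorted keys."""
    with open(fname, "w") as f:
        json.dump(annotations, f, indent=4, separators=(',', ': '), sort_keys=True)
        f.write("\n")


def read_json(fname):
    """Read a JSON file back into Python objects."""
    with open(fname, "r") as f:
        return json.load(f)


# Hypothetical sample data, shaped like the items built later in this article.
sample = [{"filename": "AL_00001.JPG", "class": "image", "annotations": []}]

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "sample.json")
write_json(path, sample)
loaded = read_json(path)
print(loaded == sample)

# Remove the temporary directory again.
shutil.rmtree(tmp_dir)
```

With `sort_keys=True` the keys are written in alphabetical order, so the output file stays stable between runs and is easy to diff.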

2. XML2Dict, a toolkit for XML files

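XML2Dict converts XML into a native Python dictionary object. Child elements are accessed much like ordinary dictionary entries, with one difference: you can reach them with "." as well.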

Note: to use the xml2dict library, you need to add xml2dict.py and object_dict.py to your local project.
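
To picture how "." access on a dictionary can work, here is a minimal sketch. The class `DotDict` is a hypothetical helper written for this illustration; it is not the code in object_dict.py, which may behave differently in its details.

```python
class DotDict(dict):
    """A dict whose keys can also be read as attributes (hypothetical helper)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods
        # such as .keys() keep working as usual.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


box = DotDict(xmin="270", ymin="376", xmax="352", ymax="503")
obj = DotDict(name="l_faster", bndbox=box)

print(obj.name)         # attribute-style access
print(obj["name"])      # ordinary dictionary access still works
print(obj.bndbox.xmin)  # nested access chains naturally
```

One consequence of this design is that a key with the same name as a dict method (for example `keys`) can only be read with the bracket syntax.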

Loading an XML file


from xml2dict import XML2Dict
xml = XML2Dict()
r = xml.parse("your_file.xml")  # path of the XML file to process
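
If xml2dict.py is not at hand, the general idea of turning XML into nested dictionaries can be sketched with the standard library alone. The function `element_to_dict` below is an assumption made for illustration, not XML2Dict's implementation: it returns plain dicts, lists, and strings, without "." access.

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Recursively convert an Element into nested dicts, lists, and strings."""
    children = list(elem)
    if not children:
        # Leaf element: keep its text content (or an empty string).
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag is collected into a list of values.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


# Small inline document so the sketch runs without any file on disk.
xml_text = """
<annotation>
    <filename>AL_00001.JPG</filename>
    <object><name>l_faster</name></object>
    <object><name>r_faster</name></object>
</annotation>
"""

root = ET.fromstring(xml_text)
data = {root.tag: element_to_dict(root)}
print(data["annotation"]["filename"])
print([o["name"] for o in data["annotation"]["object"]])
```

Note that in this sketch a tag that occurs only once stays a single value rather than a one-element list, so code that loops over `object` should check for that case.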

Sample XML [VOC2007 format]:


<annotation>
    <folder>VOC2007</folder>
    <filename>AL_00001.JPG</filename>
    <size>
        <width>800</width>
        <height>1160</height>
        <depth>3</depth>
    </size>
    <object>
        <name>l_faster</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>270</xmin>
            <ymin>376</ymin>
            <xmax>352</xmax>
            <ymax>503</ymax>
        </bndbox>
    </object>
    <object>
        <name>l_faster</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>262</xmin>
            <ymin>746</ymin>
            <xmax>355</xmax>
            <ymax>871</ymax>
        </bndbox>
    </object>
    <object>
        <name>r_faster</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>412</xmin>
            <ymin>376</ymin>
            <xmax>494</xmax>
            <ymax>486</ymax>
        </bndbox>
    </object>
    <object>
        <name>r_faster</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>411</xmin>
            <ymin>748</ymin>
            <xmax>493</xmax>
            <ymax>862</ymax>
        </bndbox>
    </object>
</annotation>

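Let's analyze the format of this file:

The outermost layer is wrapped in <annotation></annotation>.

One level in are <folder></folder>, <filename></filename>, <size></size>, and <object></object>. Here object is a list, and each entry includes name and bndbox. The example below accesses the child elements of annotation: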


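# -*- coding: utf-8 -*-
from __future__ import print_function

from xml2dict import XML2Dict

xml = XML2Dict()
r = xml.parse('Annotations/AL_00001.xml')
# Iterating over r.annotation yields the names of its child elements
for item in r.annotation:
    print(item)
print('------------')
# Each entry of r.annotation.object is one labeled bounding box
for item in r.annotation.object:
    box = item.bndbox
    print(item.name, box.xmin, box.xmax, box.ymin, box.ymax)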

Output:


object
folder
size
value
filename
------------
l_faster 270 352 376 503
l_faster 262 355 746 871
r_faster 412 494 376 486
r_faster 411 493 748 862
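
The box lines of this output can be cross-checked without xml2dict. The sketch below uses the standard library's `xml.etree.ElementTree` on a trimmed inline copy of the sample (first two objects only); it is an alternative way to read the same fields, not a description of how XML2Dict works internally.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample file: only the first two <object> entries.
xml_text = """<annotation>
    <filename>AL_00001.JPG</filename>
    <object>
        <name>l_faster</name>
        <bndbox>
            <xmin>270</xmin><ymin>376</ymin><xmax>352</xmax><ymax>503</ymax>
        </bndbox>
    </object>
    <object>
        <name>l_faster</name>
        <bndbox>
            <xmin>262</xmin><ymin>746</ymin><xmax>355</xmax><ymax>871</ymax>
        </bndbox>
    </object>
</annotation>"""

root = ET.fromstring(xml_text)
rows = []
# findall("object") returns every direct <object> child of <annotation>.
for obj in root.findall("object"):
    box = obj.find("bndbox")
    rows.append((
        obj.findtext("name"),
        box.findtext("xmin"), box.findtext("xmax"),
        box.findtext("ymin"), box.findtext("ymax"),
    ))

# Same column order as the XML2Dict example: name, xmin, xmax, ymin, ymax.
for row in rows:
    print(" ".join(row))
```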

Complete code [xml2json]


# -*- coding: utf-8 -*-
from __future__ import print_function

import glob
import json

from xml2dict import XML2Dict


def serializeToFile(fname, annotations):
    """
    Write the annotations to a JSON file.
    """
    with open(fname, "w") as f:
        json.dump(annotations, f, indent=4, separators=(',', ': '), sort_keys=True)
        f.write("\n")


def getAnnos(file_name="", prefix=''):
    xml = XML2Dict()
    root = xml.parse(file_name)
    # root is a dict-like object; its fields can be read with "."
    anno = root.annotation
    image_name = anno.filename
    item = {'filename': prefix + image_name, 'class': 'image', 'annotations': []}

    for obj in anno.object:
        # Map the VOC class name to the label used in the JSON output
        cls = {'l_faster': 'C1', 'r_faster': 'C2'}[obj.name]
        box = obj.bndbox
        # Convert corner coordinates to x, y, width, height
        x, y = int(box.xmin), int(box.ymin)
        width = int(box.xmax) - x
        height = int(box.ymax) - y
        item['annotations'].append({
            "class": cls,
            "height": height,
            "width": width,
            "x": x,
            "y": y
        })
    return item


if __name__ == '__main__':
    annotations = []
    anno_name = 'AR_001-550.json'
    files = sorted(glob.glob('Annotations/AR_*.xml'))
    for filename in files:
        item = getAnnos(filename, prefix='TFS/JPEGImages/')
        print(item)
        print('-----------------')
        annotations.append(item)
    serializeToFile(anno_name, annotations)
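
The core of the conversion is the change from corner coordinates (xmin, ymin, xmax, ymax) to a top-left point plus width and height. The sketch below isolates that step and applies it to the first object of the sample file; the helper name `corners_to_xywh` is an assumption made for illustration.

```python
import json


def corners_to_xywh(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to (x, y, width, height) as integers."""
    xmin, ymin, xmax, ymax = int(xmin), int(ymin), int(xmax), int(ymax)
    return xmin, ymin, xmax - xmin, ymax - ymin


# First <object> of the sample file: xmin=270, ymin=376, xmax=352, ymax=503.
# The values arrive as strings because they are read from XML text.
x, y, width, height = corners_to_xywh("270", "376", "352", "503")

annotation = {"class": "C1", "x": x, "y": y, "width": width, "height": height}
print(json.dumps(annotation, indent=4, separators=(',', ': '), sort_keys=True))
```

For this box the width is 352 - 270 = 82 and the height is 503 - 376 = 127, so the resulting entry has x=270, y=376, width=82, height=127.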

