Posted by Marta on October 9, 2021 Viewed 98 times
Learn to convert XML to CSV in Python using three different Python modules: xml.etree, xmltodict and untangle. Find out which one is more efficient.
In this article, you will learn how you can convert an XML to CSV in Python. I will show you how to convert from XML to CSV in different ways using different libraries.
XML is a format that is slowly fading out and being replaced by JSON, however it is still present and used mainly in old platforms.
Converting XML to CSV generally involves the following steps:
In this tutorial, we will work with a file containing employees, where each employee has a name, a role and age. See xml file below:
<employees> <employee> <name>Dave</name> <role>Sale Assistant</role> <age>34</age> </employee> <employee> <name>Robert</name> <role>Product Expert</role> <age>38</age> </employee> <employee> <name>Jane</name> <role>Marketing Manager</role> <age>31</age> </employee> </employees>
Let see the details to code this task using three different Python modules. Let’s get started.
Firstly we will see an approach for converting XML to CSV which doesn’t require you to install any extra libraries. Python includes a module to manipulate XML called
xml.etree.ElementTree. This module will allow you to easily parse and create XML files. See official documentation here.
The code below will import the xml data, extract the employees details( name, role and age) and create a row in the CSV file for each employee. To do so we will use three essential methods:
ElementTree.parse("<REPLACE BY XML FILE PATH">): to import the xml data into your program.
.findall("<REPLACE_BY_XML_TAG>"): it will find all elements with a given tag that are direct children of the current element.
.find("<REPLACE_BY_XML_TAG>"): This method will find the first child element with the a given tag.
from xml.etree import ElementTree import csv # PARSE XML xml = ElementTree.parse("data.xml") # CREATE CSV FILE csvfile = open("data.csv",'w',encoding='utf-8') csvfile_writer = csv.writer(csvfile) # ADD THE HEADER TO CSV FILE csvfile_writer.writerow(["name","role","age"]) # FOR EACH EMPLOYEE for employee in xml.findall("employee"): if(employee): # EXTRACT EMPLOYEE DETAILS name = employee.find("name") role = employee.find("role") age = employee.find("age") csv_line = [name.text, role.text, age.text] # ADD A NEW ROW TO CSV FILE csvfile_writer.writerow(csv_line)
The output is a file called
data.csv, generated in the same folder as your python program, containing the following:
name,role,age Dave,Sale Assistant,34 Robert,Product Expert,38 Jane,Marketing Manager,31
Another way of working with XML is converting the XML file to a Python object. This will allow you to avoid all xml processing to search for element through the XML file. The code below will do exactly the same as the code above, but using the
xmltodict module to parse the xml, instead of
xmltodict is a third-party module, meaning you will need to install it separately. To do so, run the following command from your terminal:
>> pip install xmltodict
xmltodict offers a
.parse() which will read the xml and then convert it to an ordered dictionary so the data is easily accesible.
import xmltodict, csv # PARSE XML FILE with open("data.xml") as xmlfile: xml = xmltodict.parse(xmlfile.read()) # CREATE CSV FILE csvfile = open("data2.csv",'w',encoding='utf-8') csvfile_writer = csv.writer(csvfile) # ADD HEADER csvfile_writer.writerow(["name","role","age"]) #FOR EACH EMPLOYEE for employee in xml["employees"]["employee"]: # EXTRACT EMPLOYEE DETAILS csv_line = [employee["name"], employee["role"],employee["age"]] # ADD A NEW ROW TO CSV FILE csvfile_writer.writerow(csv_line)
The output of the above code snippet is a file called
data2.csv containing the employees information.
Another useful thing to know about the
xmltodict module is that it allows you to convert a dictionary to xml.
Lastly you can convert XML to CSV using a Python module called untangle. Similarly to the
untangle module parses an XMl document and make the data easily accessible.
untangle is a third-party module as well, therefore you will need to install it separetaly usign pip as follows:
>> pip install untangle
Below there is a code snippet that will import the content of an XML file, create a CSV file, and then extract the data in the XML file and write in the CSV, using the
import untangle, csv # PARSE XML FILE xml = untangle.parse("data.xml") # CREATE CSV FILE csvfile = open("data3.csv",'w',encoding='utf-8') csvfile_writer = csv.writer(csvfile) # ADD HEADER csvfile_writer.writerow(["name","role","age"]) # FOR EACH EMPLOYEE for employee in xml.employees.employee: # EXTRACT EMPLOYEE DETAILS csv_line = [employee.name.cdata, employee.role.cdata, employee.age.cdata] # ADD NEW ROW TO CSV FILE csvfile_writer.writerow(csv_line)
Since the approaches to convert xml and csv are quite similar, which one should you choose? Choosing the fastest is always a good idea.
To compare these three modules, I have created an XML file with 600 employees and get the running time in milliseconds for each module. See the results below:
As you can see the
xml.tree module is at least twice faster than the
xmltodict module, and much faster than the
To summarise, there are a few modules available in Python that allow you to convert XML to CSV. Which one you choose depends on what you are trying to achieve. If you are only converting XML to CSV and I will recommend to use
xml.etree built-in python module which provides the best performance. However, if you need to also create an XML file, then the
xmltodict is a really handy tool.
I hope this article was useful and thank you for reading, and supporting this blog. Happy coding!
Steady pace book with lots of worked examples. Starting with the basics, and moving to projects, data visualisation, and web applications
Unique lay-out and teaching programming style helping new concepts stick in your memory
Great guide for those who want to improve their skills when writing python code. Easy to understand. Many practical examples
Perfect Boook for anyone who has an alright knowledge of Java and wants to take it to the next level.
Excellent read for anyone who already know how to program and want to learn Best Practices
Perfect book for anyone transitioning into the mid/mid-senior developer level
Great book and probably the best way to practice for interview. Some really good information on how to perform an interview. Code Example in Java