Blog: The easy way to work with CSV, JSON, and XML in Python


Python’s flexibility and ease of use are what make it one of the most popular programming languages, especially among Data Scientists. A big part of that is how simple it is to work with large datasets.

Every technology company today is building up a data strategy. They’ve all realised that having the right data (insightful, clean, and as much of it as possible) gives them a key competitive advantage. Data, if used effectively, can offer deep, beneath-the-surface insights that can’t be discovered anywhere else.

Over the years, the list of possible formats that you can store your data in has grown significantly. But three of them dominate everyday usage: CSV, JSON, and XML. In this article, I’m going to share with you the easiest ways to work with these three popular data formats in Python!

CSV Data

A CSV file is one of the most common ways to store your data. You’ll find that most of the data coming from Kaggle competitions is stored this way. We can both read and write CSV files using Python’s built-in csv module. Usually, we’ll read the data into a list of lists.

When we run csv.reader(), all of our CSV data becomes accessible. Calling next(csvreader) reads a single line from the CSV; every time you call it, it moves to the next line. (In Python 2 this was written csvreader.next(), a method that no longer exists in Python 3.) We can also loop through every row of the CSV with a for-loop, as in for row in csvreader. Make sure that you have the same number of columns in each row; otherwise, you’ll likely run into errors when working with your list of lists.
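
Here’s a minimal sketch of the reading steps described above. The file name example.csv and its contents are made up for illustration, and the file is created in a temporary folder first so that the snippet runs on its own.

```python
import csv
import os
import tempfile

# Hypothetical sample file, written here only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w", newline="") as f:
    f.write("name,age\nAlice,30\nBob,25\n")

# newline="" is what the csv docs recommend when opening CSV files.
with open(path, newline="") as csvfile:
    csvreader = csv.reader(csvfile)

    # next() pulls a single row; here that is the header row.
    fields = next(csvreader)

    # Looping over the reader yields the remaining rows, one list per row.
    rows = [row for row in csvreader]

print(fields)  # ['name', 'age']
print(rows)    # [['Alice', '30'], ['Bob', '25']]
```

Note that every value comes back as a string, so numeric columns need converting (for example with int()) before you do any maths on them.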

Writing to CSV in Python is just as easy. Set up your field names in a single list, and your data in a list of lists. This time we’ll create a writer() object and use it to write our data to file very similarly to how we did the reading.
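
A short sketch of that writing step might look like the following. The field names, the rows, and the output.csv file name are all placeholder values chosen for this example.

```python
import csv
import os
import tempfile

# Placeholder data: field names in one list, rows in a list of lists.
fields = ["name", "age"]
rows = [["Alice", 30], ["Bob", 25]]

# Write into a temporary folder so the example leaves nothing behind.
path = os.path.join(tempfile.mkdtemp(), "output.csv")
with open(path, "w", newline="") as csvfile:
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(fields)   # header row
    csvwriter.writerows(rows)    # all data rows in one call

# Read the file back as text to check what was written.
with open(path, newline="") as f:
    written_lines = f.read().splitlines()

print(written_lines)
```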

Of course, installing the wonderful Pandas library will make working with your data far easier once you’ve read it into a variable. Reading from CSV is a single line as is writing it back to file!
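
As a rough illustration, assuming Pandas is installed (pip install pandas), the round trip could look like this. The file names and sample data are invented for the example.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample file so the snippet is self-contained.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "example.csv")
with open(src, "w") as f:
    f.write("name,age\nAlice,30\nBob,25\n")

# Reading is a single line: the header row becomes the column names.
df = pd.read_csv(src)

# Writing is a single line too; index=False leaves out the row-number column.
dst = os.path.join(tmp, "output.csv")
df.to_csv(dst, index=False)

print(df)
```

Unlike the csv module, read_csv() infers column types, so the age column here comes back as integers rather than strings.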

We can even use Pandas to convert from CSV to a list of dictionaries with a quick one-liner. Once we have the data formatted as a list of dictionaries, we can use the dicttoxml library to convert it to XML format. We can also save it to file as JSON!

JSON Data

JSON provides a clean and easily readable format because it maintains a dictionary-style structure. Just as with CSV, Python has a built-in module for JSON that makes reading and writing super easy! When we read in the JSON, it becomes a dictionary (or a list, if the top level of the file is an array). We can then write that dictionary back to file.
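
A minimal sketch of that read-then-write cycle with the built-in json module is shown here. The file names and the sample content are assumptions made for the example.

```python
import json
import os
import tempfile

# Hypothetical sample file so the snippet is self-contained.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "data.json")
with open(src, "w") as f:
    f.write('{"name": "Alice", "languages": ["Python", "R"]}')

# json.load() turns the file contents into plain Python objects.
with open(src) as f:
    data = json.load(f)

# json.dump() writes them back; indent=4 makes the file easier to read.
dst = os.path.join(tmp, "output.json")
with open(dst, "w") as f:
    json.dump(data, f, indent=4)

print(type(data).__name__)  # dict
print(data["languages"])    # ['Python', 'R']
```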

And as we saw before, once we have our data we can easily convert it to CSV via Pandas or with the built-in Python csv module. For converting to XML, the dicttoxml library is once again our friend.
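
For the built-in route, one possible sketch uses csv.DictWriter. It assumes the JSON holds a flat list of records that all share the same keys (nested JSON would need flattening first), and the sample data is made up.

```python
import csv
import json
import os
import tempfile

# Hypothetical JSON content: a flat list of records with identical keys.
json_text = '[{"name": "Alice", "city": "Paris"}, {"name": "Bob", "city": "Oslo"}]'
records = json.loads(json_text)

csv_path = os.path.join(tempfile.mkdtemp(), "from_json.csv")
with open(csv_path, "w", newline="") as csvfile:
    # Take the column names from the keys of the first record.
    writer = csv.DictWriter(csvfile, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)

# Read the file back as text to check the result.
with open(csv_path, newline="") as f:
    csv_lines = f.read().splitlines()

print(csv_lines)
```

With Pandas installed, pd.DataFrame(records).to_csv(csv_path, index=False) should give an equivalent file in one line.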

XML Data

XML is a bit of a different beast from CSV and JSON. Generally, CSV and JSON are widely used due to their simplicity. They’re both easy and fast to read, write, and interpret as a human. There’s no extra work involved and parsing a JSON or CSV is very lightweight.

XML, on the other hand, tends to be a bit heavier. You’re sending more data, which means you need more bandwidth, more storage space, and more run time. But XML does come with a few extra features over JSON and CSV: namespaces for building and sharing standard structures, better representation of inheritance, and industry-standardised ways of describing your data, such as XML Schema and DTD.

To read in the XML data, we’ll use Python’s built-in XML module with its sub-module ElementTree. From there, we can serialise the parsed tree back to a string and convert it to a dictionary using the xmltodict library. Once we have a dictionary, we can convert to CSV, JSON, or a Pandas DataFrame like we saw above!
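
Here is a small sketch of that flow, assuming the third-party xmltodict package is installed (pip install xmltodict). The XML content and the file name are invented for the example.

```python
import json
import os
import tempfile
import xml.etree.ElementTree as ET

import xmltodict  # third-party package

# Hypothetical sample file so the snippet is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.xml")
with open(path, "w") as f:
    f.write(
        "<people>"
        "<person><name>Alice</name><city>Paris</city></person>"
        "<person><name>Bob</name><city>Oslo</city></person>"
        "</people>"
    )

# Parse the file with the built-in ElementTree module.
tree = ET.parse(path)
root = tree.getroot()

# xmltodict works on XML text, so serialise the tree back to a string first.
xml_str = ET.tostring(root, encoding="unicode")
data = xmltodict.parse(xml_str)

# Repeated <person> elements become a list of dictionaries.
people = data["people"]["person"]
print(json.dumps(people, indent=4))
```

If you don’t need the ElementTree object for anything else, xmltodict.parse() can also be given the file’s text directly.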


Like to learn?

Follow me on twitter where I post all about the latest and greatest AI, Technology, and Science! Connect with me on LinkedIn too!


Recommended Reading

Want to learn more about coding in Python? The Python Crash Course book is the best resource out there for learning how to code in Python!

And just a heads up, I support this blog with Amazon affiliate links to great books, because sharing great books helps everyone! As an Amazon Associate I earn from qualifying purchases.

Source: Artificial Intelligence on Medium
