Reading & Parsing JSON Data with Python: A Simple Tutorial


JavaScript Object Notation (JSON) is a standard text format used mostly by APIs and websites to store and exchange data objects. In simple words, JSON represents data structures as text. As you might have already guessed, JSON is derived from JavaScript, and it is also supported by contemporary databases such as PostgreSQL.

Though XML and YAML serve the same purpose as JSON, JSON is simpler but not as powerful. Regardless, many developers hold the opinion that JSON made XML and YAML obsolete. 

While this is debatable, what’s interesting is that most programming languages, Python included, support JSON. This is mainly because it is so easy to use. 

If you’re ready to equip yourself with JSON skills to supplement your Python programming, here’s a brief guide demonstrating how you can read and parse JSON data with Python. 

Simple JSON Example

JSON is most notably used for transferring data objects over APIs. Let’s take a look at a simple JSON string:

{
  "name": "France",
  "population": 65640482,
  "capital": "Paris",
  "languages": [
    "French"
  ]
}

 

Look familiar? The example above looks just like a Python dictionary, right? 

That's because a JSON object holds data just like a Python dictionary does, in key-value pairs. The values can be numbers, strings, Boolean values, null, arrays (which look like Python lists), and even other objects.

You might have also noticed how lightweight JSON is. Unlike a similar format such as XML, JSON has no markup involved. It simply holds the data that needs to be transferred. This characteristic of JSON is what makes it so popular.

JSON in Python

Most widely used languages, including C, C++, C#, Ruby, Java, and Python, can work with JSON, either through their standard library or through third-party libraries. In Python, support is built in, so you won't need to install dependencies or anything else to use it.

Python offers the ability to write custom encoders and decoders via the json module in its standard library. The module handles converting data from the JSON format to the equivalent Python objects, such as lists and dictionaries. Additionally, the module can convert Python objects to the JSON format.
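As a rough illustration of what a custom encoder and decoder can look like, the sketch below uses the default parameter of json.dumps() and the object_hook parameter of json.loads(). The record, its national_day field, and the helper names encode_extra and decode_extra are made up for this example; dumps() and loads() themselves are covered later in this guide.

```python
import json
from datetime import date

# Hypothetical example data: a date is not JSON-serializable by default.
record = {"name": "France", "national_day": date(2024, 7, 14)}


def encode_extra(obj):
    """Called by json.dumps() only for objects it cannot serialize itself."""
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def decode_extra(dct):
    """Called by json.loads() for every JSON object it decodes."""
    if "national_day" in dct:
        dct["national_day"] = date.fromisoformat(dct["national_day"])
    return dct


encoded = json.dumps(record, default=encode_extra)
decoded = json.loads(encoded, object_hook=decode_extra)

print(encoded)
print(decoded == record)
```

The date is stored as an ISO-formatted string in the JSON text and turned back into a date object when the text is parsed, which is one simple way to handle types that JSON has no equivalent for.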

What You Need to Do Before Parsing JSON Data

JSON data usually arrives as text, so when working with APIs you will typically hold it in a string variable. The json module can then parse that string.

Though the json module can do other things, it's most frequently used to parse strings into Python dictionaries, and it handles this task easily. As with any other module in the Python standard library, you must begin by importing json. It provides two very helpful methods for parsing: load and loads.

Though the latter method looks like the plural form of the former, this is not the case. The “s” at the end of “loads” stands for “string.” This method enables the parsing of JSON data from strings.

In contrast, the load method parses JSON data read from a file object, such as the one returned by open(). Let's see how loads() works first.

Though you can use Python's triple quote convention to store multi-line strings, it's possible to remove the line breaks to keep the string on one line. Here's how you'd create a JSON string in Python:

country = '{"name": "France", "population": 65640482}'
print(type(country))

 

The second statement prints the type of the variable we created, like so:

<class 'str'>

 

This output confirms that the variable holds a string. Now, we can pass the string to loads() as an argument:

import json

country = '{"name": "France", "population": 65640482}'
country_dict = json.loads(country)

print(type(country))
print(type(country_dict))

 

The output of the final statement in the code above is <class 'dict'>, confirming that the JSON string in "country" has been converted into a dictionary. You can now use the dictionary like any other Python dictionary.
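To make "use it like any other dictionary" concrete, here is a small sketch that reads values back out of the parsed result. It reuses the country string from the example above; the lookup of a "capital" key is only there to show what happens when a key is absent.

```python
import json

country = '{"name": "France", "population": 65640482}'
country_dict = json.loads(country)

# Look up a value by key, as with any dictionary.
print(country_dict["name"])

# .get() returns a default instead of raising KeyError for a missing key.
capital = country_dict.get("capital", "unknown")
print(capital)

# Iterate over the key-value pairs.
for key, value in country_dict.items():
    print(f"{key}: {value}")
```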

Before you begin using json.loads() in your code, note that the method doesn't always return dictionaries. The data type it returns is determined by the input string.

Let’s look at an example where the JSON string returns a list:

import json

countries = '["France", "Ghana"]'
countries_list = json.loads(countries)

print(type(countries_list))
# This outputs <class 'list'>

 

A JSON string can also parse to a Boolean value if it holds the right data. So, true in JSON converts to the Python Boolean True. Let's see this in action:

import json

bool_string = 'true'
bool_type = json.loads(bool_string)

print(bool_type)
# This outputs True

 

Here's a table indicating JSON types and their corresponding Python data types, so you don't have to rely on writing your own examples to figure it out:

JSON               Python
array              list
false              False
null               None
number (integer)   int
number (real)      float
object             dict
string             str
true               True
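If you would still like to see the table in action, the following sketch parses one sample per row and records the resulting Python type. The sample strings are arbitrary values chosen for illustration.

```python
import json

# One sample JSON text per row of the table above.
samples = [
    '["France", "Ghana"]',  # array
    'false',                # false
    'null',                 # null
    '42',                   # number (integer)
    '3.14',                 # number (real)
    '{"name": "France"}',   # object
    '"Paris"',              # string
    'true',                 # true
]

parsed_types = {}
for text in samples:
    value = json.loads(text)
    parsed_types[text] = type(value).__name__
    print(f"{text} -> {type(value).__name__}")
```

Note that the type name printed for null is NoneType, which is the type of Python's None.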

 

Parsing JSON Files in Python

So far, we've dealt with parsing JSON strings in Python. Parsing JSON files into Python data involves a similar process and is equally simple. That said, parsing JSON files demands that you use the built-in open() function in addition to the json module. You must also switch to load() to read JSON data stored in files.

The load() method accepts a file object and returns a Python object. To get the file object from the file's location, open() is used. Let's see how this works. Begin by storing the following JSON document in a file named "france.json":

{
  "name": "France",
  "population": 65640482,
  "capital": "Paris",
  "languages": [
    "French"
  ]
}

 

Now, create another text file and put this Python script in it:

import json

with open('france.json') as f:
    data = json.load(f)

print(type(data))

 

The open() method above returns a file handle which is then handed to the load method. The “data” variable holds the JSON data as a dictionary. Here’s what you can run to check the dictionary keys:

print(data.keys())
# The output of this code is dict_keys(['name', 'population', 'capital', 'languages'])

 

Now that you have this information, you can print specific values like so:

print(data['population'])
# This outputs 65640482
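To try the file-based steps without creating files by hand, the sketch below writes the same JSON document to a temporary directory and then loads it back. The use of tempfile and the explicit encoding="utf-8" argument are choices made for this example, not requirements of json.load().

```python
import json
import os
import tempfile

france_json = """{
  "name": "France",
  "population": 65640482,
  "capital": "Paris",
  "languages": ["French"]
}"""

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "france.json")

    # Write the JSON text to a file so there is something to parse.
    with open(path, "w", encoding="utf-8") as f:
        f.write(france_json)

    # json.load() reads from the open file object and returns a dict.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

print(list(data.keys()))
print(data["population"])
print(data["languages"][0])  # nested values are reached by chaining lookups
```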

 

JSON Encoding

JSON encoding, which is also called serialization, refers to the process of converting Python objects to JSON objects. It is done using the dumps() method. Here’s an example to explore how it works – save the following Python script as a new file:

import json

languages = ["French"]

country = {
    "name": "France",
    "population": 65640482,
    "languages": languages,
    "president": "Emmanuel Macron",
}

country_string = json.dumps(country)
print(country_string)

 

Run the code, and you’ll see the following output:

{"name": "France", "population": 65640482, "languages": ["French"], "president": "Emmanuel Macron"}

 

As you can see, we've converted the Python object into a JSON object. Serializing a Python object into JSON data is surprisingly easy!

Bear in mind that we’ve used a dictionary object in this example, which is why it converted into a JSON object. As you might have guessed, you can also convert lists into JSON. Here’s how:

import json

languages = ["French"]
languages_string = json.dumps(languages)

print(languages_string)
# This outputs ["French"]

 

Of course, converting Python to JSON is not limited to dictionaries and lists. You can also convert strings, integers, Boolean values, and floating point numbers to JSON. 

Here’s a table to refer to when converting Python objects to JSON:

Python                                     JSON
dict                                       object
False                                      false
int, float, int- and float-derived Enums   number
list, tuple                                array
None                                       null
str                                        string
True                                       true
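The rows of this table can be checked with a short sketch that passes one value of each Python type to dumps(). The sample values are arbitrary and chosen only for illustration.

```python
import json

# One sample Python value per kind of entry in the table above.
values = [
    {"name": "France"},  # dict  -> object
    ["French"],          # list  -> array
    ("French",),         # tuple -> array
    "Paris",             # str   -> string
    65640482,            # int   -> number
    3.14,                # float -> number
    True,                # True  -> true
    False,               # False -> false
    None,                # None  -> null
]

encoded = [json.dumps(value) for value in values]

for value, text in zip(values, encoded):
    print(f"{value!r} -> {text}")
```

Lists and tuples both become JSON arrays, so a tuple that is encoded and then parsed again comes back as a list.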

 

Writing Python Objects to JSON Files

You’ll need to rely on the dump() method to write Python objects to JSON files. It works similarly to dumps(), with the difference being that dump() can write to files while dumps() returns strings. 

In the example below, we open the file in writing mode and write JSON data to it:

import json

# Tuple is encoded to JSON array.
# The trailing comma is what makes this a one-element tuple.
languages = ("French",)

# Dictionary is encoded to JSON object
country = {
    "name": "France",
    "population": 65640482,
    "languages": languages,
    "president": "Emmanuel Macron",
}

with open('countries_exported.json', 'w') as f:
    json.dump(country, f)

 

Save the script to a new file and run it. The .json file mentioned in the penultimate line will be created or overwritten. The contents of this new file will be JSON. 

But when you open this new file, you will see that the JSON will be in one line. If you’re interested in enhancing readability, you can pass an additional parameter to dump() like so: 

json.dump(country, f, indent=4)

 

If you add this parameter to the file you saved and run it, you will see that the .json file will now be formatted neatly like this:

{
    "name": "France",
    "population": 65640482,
    "languages": [
        "French"
    ],
    "president": "Emmanuel Macron"
}

 

As you can see, everything is indented four spaces. It’s interesting to note that the indent argument is also available for dumps(). 
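As a quick sketch of the same idea with dumps(), the snippet below builds both a compact and an indented string from one dictionary. The variable names compact and pretty are just labels for this example.

```python
import json

country = {
    "name": "France",
    "population": 65640482,
    "languages": ["French"],
}

# Without indent, the whole document is returned on a single line.
compact = json.dumps(country)

# With indent=4, each nesting level is indented by four spaces.
pretty = json.dumps(country, indent=4)

print(compact)
print(pretty)
```

Both strings describe the same data, so parsing either one with loads() gives back an equal dictionary.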

Difference Between Loading and Dumping in the json Module

The json module isn’t very complicated – it has only four primary methods. We’ve covered loads() and load() in this post. 

There are two more methods, dump() and dumps(), that allow you to work with JSON in Python. These functions allow you to write data to files and strings, respectively. Just like the “s” in the “loads” method stands for “string,” the “s” in “dumps” also stands for string. 

Remembering the meaning of the “s” is a great way to remember which method can be used for which task. 
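The four methods can be seen side by side in the sketch below. It uses io.StringIO as an in-memory stand-in for a file opened with open(), which is an assumption made to keep the example self-contained.

```python
import io
import json

country = {"name": "France", "population": 65640482}

# dumps(): Python object -> JSON string
as_string = json.dumps(country)

# loads(): JSON string -> Python object
from_string = json.loads(as_string)

# dump(): Python object -> written to a file-like object
buffer = io.StringIO()
json.dump(country, buffer)

# load(): file-like object -> Python object
buffer.seek(0)  # rewind so load() reads from the start
from_file = json.load(buffer)

print(as_string)
print(from_string == country)
print(from_file == country)
```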

Conclusion

So, with this guide handy, you’ve learned how to read and parse JSON data in Python. Learning to work with JSON using Python is essential if you work with websites since it’s used to transfer data virtually everywhere. Besides databases and APIs, JSON is used in web scrapers as well.

If you're working with a dynamic website, chances are you might be trying to implement web scraping. With your knowledge of JSON, you're ready to scrape pages with infinite scroll.