The popularity of JSON is due in large part to its use of universal data structures such as objects and arrays that are supported in one form or another by the majority of programming languages. This post looks at the basic structure of JSON-encoded data, how that structure relates to Python data structures, and how to use Python to reliably access JSON-encoded data.
Using The Python Code Examples
All the Python code examples in this post were tested using Python 3.8.2 (macOS) and Python 3.6.9 (Ubuntu) and can be run using the Python interpreter.
The examples assume the file macos.json is located in your home directory: /Users/username (macOS) or /home/username (Ubuntu).
If pasting code examples containing a for loop or with statement, press enter twice to execute the code and return to the >>> Python interpreter prompt.
The Structure of JSON-encoded Data
A more detailed description of the structure of JSON-encoded data can be found here, but for the purposes of this post:
- JSON-encoded data consists of arrays, objects, values and name/value pairs.
- An array is surrounded by square brackets [...] and contains a comma-separated list of values.
- An object is surrounded by braces {...} and contains a comma-separated list of name/value pairs.
- A name/value pair consists of a name in double quotes "...", followed by a colon :, followed by a value.
- A value can be a string, a number, a boolean, null, another array or another object.
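As a quick illustration of these building blocks, the sketch below decodes a small, made-up JSON string (not the macos.json file used in this post) with json.loads() and prints the Python type each JSON value becomes. The names in the sample are invented purely for the example.

```python
import json

# A made-up JSON document that exercises every kind of value listed above.
sample = (
    '{"name": "example", "count": 3, "ratio": 0.5, "ok": true, '
    '"missing": null, "tags": ["a", "b"], "nested": {"key": "value"}}'
)

# json.loads() decodes a JSON string; json.load() does the same for a file.
decoded = json.loads(sample)

# Show which Python type each JSON value was converted to.
for name, value in decoded.items():
    print(name, type(value).__name__)
```

A JSON object becomes a dict, an array a list, a string a str, numbers an int or float, true/false a bool and null becomes None.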
Consider the following file containing JSON-encoded data on two versions of macOS:
{
    "updated": "2020-07-09",
    "versions": [
        {
            "family": "macOS",
            "version": "10.14",
            "codename": "Mojave",
            "announced": "2018-06-04",
            "released": "2018-09-24"
        },
        {
            "family": "macOS",
            "version": "10.15",
            "codename": "Catalina",
            "announced": "2019-06-03",
            "released": "2019-10-07"
        }
    ]
}
- The top-level object defines two name/value pairs.
- The first is updated and has a string value of 2020-07-09 representing the date the information in the file was last updated.
- The second is versions whose value is an array.
- Each value in the versions array is an object representing a single macOS version.
- Each object representing a single macOS version contains information on that version in the form of five name/value pairs: family, version, codename, announced and released. All values are strings.
Accessing JSON-encoded Data in Python
To allow Python to access the JSON-encoded data we first need to open the file and then convert its contents into Python data structures. This conversion is known as decoding or deserializing and in Python is performed using the load() function from Python’s json module. As part of this deserializing process a JSON array is converted to a Python list [] and a JSON object is converted to a Python dictionary {}.
import os
import json

f = open(os.environ["HOME"] + "/macos.json", "r", encoding="utf-8")
data = json.load(f)
f.close()
The deserialized JSON-encoded data is now stored in the variable data. The output has been re-formatted for clarity:
print(data)
{
    'updated': '2020-07-09',
    'versions': [
        {
            'family': 'macOS',
            'version': '10.14',
            'codename': 'Mojave',
            'announced': '2018-06-04',
            'released': '2018-09-24'
        },
        {
            'family': 'macOS',
            'version': '10.15',
            'codename': 'Catalina',
            'announced': '2019-06-03',
            'released': '2019-10-07'
        }
    ]
}
Apart from double quotes " being replaced with single quotes ', this output looks identical to the JSON-encoded data in our file. Remember, however, that what constitutes an array in the JSON-encoded data is now a Python list [] and what are objects in that same data are now Python dictionaries {}.
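One way to confirm this is to ask Python for the type of each piece. The sketch below is a minimal check that inlines a trimmed copy of the data as a string, so it runs without the macos.json file. It also uses json.dumps(), which serializes a Python structure back into JSON text with double quotes.

```python
import json

# Trimmed copy of the macos.json example, inlined so the snippet is self-contained.
text = """
{
  "updated": "2020-07-09",
  "versions": [
    {"family": "macOS", "version": "10.14", "codename": "Mojave"},
    {"family": "macOS", "version": "10.15", "codename": "Catalina"}
  ]
}
"""
data = json.loads(text)

print(type(data).__name__)                 # dict
print(type(data["versions"]).__name__)     # list
print(type(data["versions"][0]).__name__)  # dict

# Serializing back to JSON restores the double quotes.
print(json.dumps(data["versions"][0]))
```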
To access a specific piece of data we can use bracket notation. For example, to get the date the information was last updated:
print(data["updated"])
2020-07-09
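Bracket notation raises a KeyError when the name does not exist in the dictionary. The sketch below, which uses a cut-down stand-in for the data and a name ("author") that is deliberately absent from it, shows the dictionary get() method as a more forgiving alternative.

```python
import json

# Cut-down stand-in for the decoded macos.json data.
data = json.loads('{"updated": "2020-07-09", "versions": []}')

print(data["updated"])                # 2020-07-09

# "author" is not a name in the data; get() returns None or a chosen default.
print(data.get("author"))             # None
print(data.get("author", "unknown"))  # unknown

# Bracket notation raises KeyError for the same missing name.
try:
    data["author"]
except KeyError as error:
    print("Missing name:", error)
```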
Getting specific information on a particular macOS version is a little more involved:
- Information on both macOS versions is contained in the versions list.
- Each value of the versions list is a dictionary containing information on a single macOS version.
- We can access the information in a specific dictionary by using its position or index in the versions list. As Python lists are zero-indexed, the first dictionary has an index of 0, the second an index of 1.
NOTE: JSON Editor Online is a useful tool that displays JSON-encoded data showing the number of values in a list (JSON array) and each value’s index within that list. Please note you need to switch to tree view. Also, the code editor Visual Studio Code has an Outline view that displays JSON-encoded data in a similar way.
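A similar overview can be produced in Python itself. The sketch below is one possible approach, using the built-in len() and enumerate() functions on a trimmed, inlined copy of the data. The for loop it relies on is covered in more detail later in this post.

```python
import json

# Trimmed, inlined copy of the versions list from macos.json.
text = '{"versions": [{"codename": "Mojave"}, {"codename": "Catalina"}]}'
data = json.loads(text)

# Number of values in the list.
print(len(data["versions"]))  # 2

# enumerate() pairs each value with its zero-based index.
for index, version in enumerate(data["versions"]):
    print(index, version["codename"])
```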
So, to get the date macOS Mojave was announced we need to target the first dictionary of the versions list:
print(data["versions"][0]["announced"])
2018-06-04
To get the dates both macOS Mojave and macOS Catalina were released we need to target the first dictionary of the versions list and then its second dictionary:
print(data["versions"][0]["released"]); \
print(data["versions"][1]["released"])
2018-09-24
2019-10-07
Let’s make things a little more interesting and add some more information for each macOS version to our JSON-encoded data:
{ "updated": "2020-07-09", "versions": [ { "family": "macOS", "version": "10.14", "codename": "Mojave", "announced":"2018-06-04", "released": "2018-09-24", "requirements": [ "iMac (Late 2012 or newer)", "iMac Pro (2017)", "Mac Mini (Late 2012 or newer)", "Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)", "MacBook (Early 2015 or newer)", "MacBook Air (Mid 2012 or newer)", "MacBook Pro (Mid 2012 or newer)", "2 GB of memory", "12.5 - 18.5 GB of available avaialable disk space", "OS X 10.8 or later" ], "releases": [ { "version": "10.14", "build": "18A391", "darwin": "18.0.0", "released": "2018-09-24" }, { "version": "10.14.1", "build": "18B75", "darwin": "18.2.0", "released": "2018-10-30" }, { "version": "10.14.2", "build": "18C54", "darwin": "18.2.0", "released": "2018-12-05" }, { "version": "10.14.3", "build": "18D42", "darwin": "18.2.0", "released": "2019-01-22" }, { "version": "10.14.4", "build": "18E226", "darwin": "18.5.0", "released": "2019-03-25" }, { "version": "10.14.5", "build": "18F132", "darwin": "18.6.0", "released": "2019-05-13" }, { "version": "10.14.6", "build": "18G84", "darwin": "18.7.0", "released": "2019-07-22" } ] }, { "family": "macOS", "version": "10.15", "codename": "Catalina", "announced":"2019-06-03", "released": "2019-10-07", "requirements": [ "iMac (Late 2012 or newer)", "iMac Pro (2017)", "Mac Mini (Late 2012 or newer)", "Mac Pro (Late 2013)", "MacBook (Early 2015 or newer)", "MacBook Air (Mid 2012 or newer)", "MacBook Pro (Mid 2012 or newer)", "4 GB of memory", "12.5 GB of available avaialable disk space", "OS X 10.11.5 or later" ], "releases": [ { "version": "10.15", "build": "19A583", "darwin": "19.0.0", "released": "2019-10-07" }, { "version": "10.15.1", "build": "19B88", "darwin": "19.0.0", "released": "2019-10-29" }, { "version": "10.15.2", "build": "19C57", "darwin": "19.2.0", "released": "2019-12-10" }, { "version": "10.15.3", "build": "19D76", "darwin": "19.3.0", "released": 
"2020-01-28" }, { "version": "10.15.4", "build": "19E266", "darwin": "19.4.0", "released": "2020-03-24" }, { "version": "10.15.5", "build": "19F96", "darwin": "19.5.0", "released": "2020-05-26" } ] } ] }
In Python, the updated JSON-encoded data in the file needs to be deserialized again:
import os
import json

with open(os.environ["HOME"] + "/macos.json", "r", encoding="utf-8") as f:
    data = json.load(f)
Here we use a with statement with the open() function. This ensures the file is closed even if an exception occurs and doesn’t require the close() method to be called explicitly, as the with statement handles the proper acquisition and release of resources.
The updated deserialized data is shown below. The output has been re-formatted for clarity:
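The effect of the with statement can be checked through the file object's closed attribute. The sketch below is illustrative only: it writes a tiny stand-in file (sample.json, a made-up name) to a temporary directory rather than relying on macos.json being present.

```python
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Create a small stand-in JSON file to read back.
    path = os.path.join(tmp, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"updated": "2020-07-09"}, f)

    # Read it back; the file is closed as soon as the with block ends.
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    closed_after = f.closed

print(closed_after)     # True
print(data["updated"])  # 2020-07-09
```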
print(data)
{ 'updated': '2020-07-09', 'versions': [ { 'family': 'macOS', 'version': '10.14', 'codename': 'Mojave', 'announced': '2018-06-04', 'released': '2018-09-24', 'requirements': [ 'iMac (Late 2012 or newer)', 'iMac Pro (2017)', 'Mac Mini (Late 2012 or newer)', 'Mac Pro (Late 2013; Mid 2010 and ' 'Mid 2012 models with recommended ' 'Metal-capable graphics cards)', 'MacBook (Early 2015 or newer)', 'MacBook Air (Mid 2012 or newer)', 'MacBook Pro (Mid 2012 or newer)', '2 GB of memory', '12.5 - 18.5 GB of available ' 'avaialable disk space', 'OS X 10.8 or later' ], 'releases': [ { 'version': '10.14', 'build': '18A391', 'darwin': '18.0.0', 'released': '2018-09-24' }, { 'version': '10.14.1', 'build': '18B75', 'darwin': '18.2.0', 'released': '2018-10-30' }, { 'version': '10.14.2', 'build': '18C54', 'darwin': '18.2.0', 'released': '2018-12-05' }, { 'version': '10.14.3', 'build': '18D42', 'darwin': '18.2.0', 'released': '2019-01-22' }, { 'version': '10.14.4', 'build': '18E226', 'darwin': '18.5.0', 'released': '2019-03-25' }, { 'version': '10.14.5', 'build': '18F132', 'darwin': '18.6.0', 'released': '2019-05-13' }, { 'version': '10.14.6', 'build': '18G84', 'darwin': '18.7.0', 'released': '2019-07-22' } ] }, { 'family': 'macOS', 'version': '10.15', 'codename': 'Catalina', 'announced': '2019-06-03', 'released': '2019-10-07', 'requirements': [ 'iMac (Late 2012 or newer)', 'iMac Pro (2017)', 'Mac Mini (Late 2012 or newer)', 'Mac Pro (Late 2013)', 'MacBook (Early 2015 or newer)', 'MacBook Air (Mid 2012 or newer)', 'MacBook Pro (Mid 2012 or newer)', '4 GB of memory', '12.5 GB of available avaialable ' 'disk space', 'OS X 10.11.5 or later' ], 'releases': [ { 'version': '10.15', 'build': '19A583', 'darwin': '19.0.0', 'released': '2019-10-07' }, { 'version': '10.15.1', 'build': '19B88', 'darwin': '19.0.0', 'released': '2019-10-29' }, { 'version': '10.15.2', 'build': '19C57', 'darwin': '19.2.0', 'released': '2019-12-10' }, { 'version': '10.15.3', 'build': '19D76', 'darwin': '19.3.0', 
'released': '2020-01-28' }, { 'version': '10.15.4', 'build': '19E266', 'darwin': '19.4.0', 'released': '2020-03-24' }, { 'version': '10.15.5', 'build': '19F96', 'darwin': '19.5.0', 'released': '2020-05-26' } ] } ] }
- Each dictionary in the versions list now contains two additional name/value pairs: requirements and releases.
- requirements is a list whose values are all strings with each string representing a minimum system requirement to run that particular macOS version.
- releases is also a list, but each of its values is a dictionary representing information on each release of a single macOS version provided by four name/value pairs: version, build, darwin and released.
To get the first minimum system requirement to run macOS Mojave:
print(data["versions"][0]["requirements"][0])
iMac (Late 2012 or newer)
To get the first and second minimum system requirements to run macOS Catalina:
print(data["versions"][1]["requirements"][0]); \
print(data["versions"][1]["requirements"][1])
iMac (Late 2012 or newer)
iMac Pro (2017)
To get the build number of the first release of macOS Catalina:
print(data["versions"][1]["releases"][0]["build"])
19A583
Trying to get the build number of the eighth release of macOS Mojave raises an error. Whoops! There have only been seven releases of macOS Mojave:
print(data["versions"][0]["releases"][7]["build"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
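One way to avoid this error is to check the index against the length of the list first, or to catch the IndexError. The sketch below inlines the seven Mojave build numbers from the data above. The helper build_at() is a hypothetical name introduced only for this example.

```python
import json

# Build numbers of the seven macOS Mojave releases, inlined from macos.json.
text = (
    '{"releases": [{"build": "18A391"}, {"build": "18B75"}, {"build": "18C54"}, '
    '{"build": "18D42"}, {"build": "18E226"}, {"build": "18F132"}, {"build": "18G84"}]}'
)
releases = json.loads(text)["releases"]


def build_at(releases, index):
    """Return the build number at index, or None if there is no such release."""
    if 0 <= index < len(releases):
        return releases[index]["build"]
    return None


print(len(releases))          # 7
print(build_at(releases, 6))  # 18G84
print(build_at(releases, 7))  # None

# Alternatively, attempt the access and handle the failure.
try:
    releases[7]["build"]
except IndexError as error:
    print("No eighth release:", error)
```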
To get the build number and Darwin version of the first release of macOS Catalina:
print(data["versions"][1]["releases"][0]["build"] +" "+ data["versions"][1]["releases"][0]["darwin"])
19A583 19.0.0
To get the version, build number and Darwin version of the first, second and third release of macOS Mojave:
print(data["versions"][0]["releases"][0]["version"] +" "+ data["versions"][0]["releases"][0]["build"] +" "+ data["versions"][0]["releases"][0]["darwin"]); \
print(data["versions"][0]["releases"][1]["version"] +" "+ data["versions"][0]["releases"][1]["build"] +" "+ data["versions"][0]["releases"][1]["darwin"]); \
print(data["versions"][0]["releases"][2]["version"] +" "+ data["versions"][0]["releases"][2]["build"] +" "+ data["versions"][0]["releases"][2]["darwin"])
10.14 18A391 18.0.0
10.14.1 18B75 18.2.0
10.14.2 18C54 18.2.0
Looping Through JSON-encoded Data
As is evident from the examples above, accessing multiple values in lists is cumbersome and error-prone. Each value, whether it be a string, a number, a boolean, null or a dictionary, has to be explicitly targeted using its position or index in the list. Often, the total number of values in a list is unknown beforehand, or the number of values in similarly named lists differs, resulting in index out of range errors. For example, the releases list for macOS Mojave contains seven values (dictionaries), but the one for macOS Catalina contains only six.
To overcome this we can use a for statement to loop or iterate through every value in the list, repeating the same steps during each loop or iteration.
For example, to get the codename of every macOS version without using a for loop:
print(data["versions"][0]["codename"]); \
print(data["versions"][1]["codename"])
Mojave
Catalina
Alternatively, to get the codename of every macOS version using a for loop:
for _version in data["versions"]:
    print(_version["codename"])
Mojave
Catalina
The loop repeats for every value (dictionary) in the versions list. On each loop, the current list value is assigned to the variable _version. Subsequently, we use _version["codename"] to get the codename of that macOS version.
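Because _version is an ordinary dictionary on each pass, any of its names can be read inside the loop. The sketch below uses a trimmed, inlined copy of the data, prints two values per version and, as an optional variation, collects the codenames with a list comprehension.

```python
import json

# Trimmed, inlined copy of the versions list from macos.json.
text = (
    '{"versions": [{"version": "10.14", "codename": "Mojave"}, '
    '{"version": "10.15", "codename": "Catalina"}]}'
)
data = json.loads(text)

# Read more than one value from the current dictionary on each pass.
for _version in data["versions"]:
    print(_version["codename"], _version["version"])

# A list comprehension gathers the same values into a new list.
codenames = [_version["codename"] for _version in data["versions"]]
print(codenames)  # ['Mojave', 'Catalina']
```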
To get the codename of every macOS version together with their minimum system requirements without using a for loop:
print(data["versions"][0]["codename"]); \
print(" " + data["versions"][0]["requirements"][0]); \
print(" " + data["versions"][0]["requirements"][1]); \
print(" " + data["versions"][0]["requirements"][2]); \
print(" " + data["versions"][0]["requirements"][3]); \
print(" " + data["versions"][0]["requirements"][4]); \
print(" " + data["versions"][0]["requirements"][5]); \
print(" " + data["versions"][0]["requirements"][6]); \
print(" " + data["versions"][0]["requirements"][7]); \
print(" " + data["versions"][0]["requirements"][8]); \
print(" " + data["versions"][0]["requirements"][9]); \
print(data["versions"][1]["codename"]); \
print(" " + data["versions"][1]["requirements"][0]); \
print(" " + data["versions"][1]["requirements"][1]); \
print(" " + data["versions"][1]["requirements"][2]); \
print(" " + data["versions"][1]["requirements"][3]); \
print(" " + data["versions"][1]["requirements"][4]); \
print(" " + data["versions"][1]["requirements"][5]); \
print(" " + data["versions"][1]["requirements"][6]); \
print(" " + data["versions"][1]["requirements"][7]); \
print(" " + data["versions"][1]["requirements"][8]); \
print(" " + data["versions"][1]["requirements"][9])
Mojave
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 2 GB of memory
 12.5 - 18.5 GB of available avaialable disk space
 OS X 10.8 or later
Catalina
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 4 GB of memory
 12.5 GB of available avaialable disk space
 OS X 10.11.5 or later
Alternatively, to get the same information using a for loop:
for _version in data["versions"]:
    print(_version["codename"])
    for _requirement in _version["requirements"]:
        print(" " + _requirement)
Mojave
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 2 GB of memory
 12.5 - 18.5 GB of available avaialable disk space
 OS X 10.8 or later
Catalina
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 4 GB of memory
 12.5 GB of available avaialable disk space
 OS X 10.11.5 or later
In this instance the for loops are nested. The outer for loop is the same as in the previous example. The inner for loop repeats for every value (string) in the requirements list. On each loop, the current list value is assigned to the variable _requirement. Subsequently, to get each requirement we simply use _requirement.
Because the requirements list is contained by the versions list we have to nest the for loops. Not doing so will give incomplete results where only the requirements list of the last value in the versions list is targeted:
for _version in data["versions"]:
    print(_version["codename"])

for _requirement in _version["requirements"]:
    print(" " + _requirement)
Mojave
Catalina
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 4 GB of memory
 12.5 GB of available avaialable disk space
 OS X 10.11.5 or later
Finally, to additionally include the version of every macOS release without using a for loop:
print(data["versions"][0]["codename"]); \
print(" " + data["versions"][0]["requirements"][0]); \
print(" " + data["versions"][0]["requirements"][1]); \
print(" " + data["versions"][0]["requirements"][2]); \
print(" " + data["versions"][0]["requirements"][3]); \
print(" " + data["versions"][0]["requirements"][4]); \
print(" " + data["versions"][0]["requirements"][5]); \
print(" " + data["versions"][0]["requirements"][6]); \
print(" " + data["versions"][0]["requirements"][7]); \
print(" " + data["versions"][0]["requirements"][8]); \
print(" " + data["versions"][0]["requirements"][9]); \
print(" " + data["versions"][0]["releases"][0]["version"]); \
print(" " + data["versions"][0]["releases"][1]["version"]); \
print(" " + data["versions"][0]["releases"][2]["version"]); \
print(" " + data["versions"][0]["releases"][3]["version"]); \
print(" " + data["versions"][0]["releases"][4]["version"]); \
print(" " + data["versions"][0]["releases"][5]["version"]); \
print(" " + data["versions"][0]["releases"][6]["version"]); \
print(data["versions"][1]["codename"]); \
print(" " + data["versions"][1]["requirements"][0]); \
print(" " + data["versions"][1]["requirements"][1]); \
print(" " + data["versions"][1]["requirements"][2]); \
print(" " + data["versions"][1]["requirements"][3]); \
print(" " + data["versions"][1]["requirements"][4]); \
print(" " + data["versions"][1]["requirements"][5]); \
print(" " + data["versions"][1]["requirements"][6]); \
print(" " + data["versions"][1]["requirements"][7]); \
print(" " + data["versions"][1]["requirements"][8]); \
print(" " + data["versions"][1]["requirements"][9]); \
print(" " + data["versions"][1]["releases"][0]["version"]); \
print(" " + data["versions"][1]["releases"][1]["version"]); \
print(" " + data["versions"][1]["releases"][2]["version"]); \
print(" " + data["versions"][1]["releases"][3]["version"]); \
print(" " + data["versions"][1]["releases"][4]["version"]); \
print(" " + data["versions"][1]["releases"][5]["version"])
Mojave
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 2 GB of memory
 12.5 - 18.5 GB of available avaialable disk space
 OS X 10.8 or later
 10.14
 10.14.1
 10.14.2
 10.14.3
 10.14.4
 10.14.5
 10.14.6
Catalina
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 4 GB of memory
 12.5 GB of available avaialable disk space
 OS X 10.11.5 or later
 10.15
 10.15.1
 10.15.2
 10.15.3
 10.15.4
 10.15.5
Alternatively, using a for loop:
for _version in data["versions"]:
    print(_version["codename"])
    for _requirement in _version["requirements"]:
        print(" " + _requirement)
    for _release in _version["releases"]:
        print(" " + _release["version"])
Mojave
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013; Mid 2010 and Mid 2012 models with recommended Metal-capable graphics cards)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 2 GB of memory
 12.5 - 18.5 GB of available avaialable disk space
 OS X 10.8 or later
 10.14
 10.14.1
 10.14.2
 10.14.3
 10.14.4
 10.14.5
 10.14.6
Catalina
 iMac (Late 2012 or newer)
 iMac Pro (2017)
 Mac Mini (Late 2012 or newer)
 Mac Pro (Late 2013)
 MacBook (Early 2015 or newer)
 MacBook Air (Mid 2012 or newer)
 MacBook Pro (Mid 2012 or newer)
 4 GB of memory
 12.5 GB of available avaialable disk space
 OS X 10.11.5 or later
 10.15
 10.15.1
 10.15.2
 10.15.3
 10.15.4
 10.15.5
The last for loop repeats for every value (dictionary) in the releases list. On each loop, the current list value is assigned to the variable _release. Subsequently, we use _release["version"] to get the version of each release.
Similar to the requirements list, the releases list is also contained within the versions list, so the for loop targeting the releases list is also nested within the for loop targeting the versions list.
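The same nested pattern copes with releases lists of different lengths and can print several values per release. The sketch below is a variation on the examples above, using a trimmed, inlined copy of the data and an f-string (available from Python 3.6) in place of string concatenation.

```python
import json

# Trimmed, inlined copy of macos.json with releases lists of different lengths.
text = """
{
  "versions": [
    {
      "codename": "Mojave",
      "releases": [
        {"version": "10.14", "build": "18A391", "darwin": "18.0.0"},
        {"version": "10.14.1", "build": "18B75", "darwin": "18.2.0"}
      ]
    },
    {
      "codename": "Catalina",
      "releases": [
        {"version": "10.15", "build": "19A583", "darwin": "19.0.0"}
      ]
    }
  ]
}
"""
data = json.loads(text)

lines = []
for _version in data["versions"]:
    lines.append(_version["codename"])
    # The inner loop runs once per release, however many there are.
    for _release in _version["releases"]:
        lines.append(f" {_release['version']} {_release['build']} {_release['darwin']}")

print("\n".join(lines))
```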
Notes
- The information on macOS (Mac OS X) releases contained in the file macos.json is for demonstration purposes only. The version of the file used in this post is a subset of the data contained in the original. The latest, complete version can be found here. While every attempt has been made to ensure this data is correct, its accuracy is not guaranteed.
- parse-json.py is a Python script used to parse the JSON-encoded data in the file macos.json and is based on the examples in this post.
- A Node.js application that runs this Python script and displays the results can be found at macos.tech-otaku.com
- The GitHub repository for the Node.js application can be found at macos-versions.