Generating Climate Temperature Spirals in Python

Posted on Thu 29 August 2019 in posts • 7 min read

Hey there!¶

Welcome to my first post!

In this post, we will try to recreate the famous animated visulatization of climate scientist Ed Hawkins, that captivated the crowds from the moment Hawkins tweeted it back in 2017.
By the end of this post, you will be able to:

Manipulate data using simple Pandas and Numpy functions
learn how to plot with Polar coordinates using Matplotlib
Create an animation from multiple plots using Matplotlib's animation module

This project was inspired by Dataquest's post. To try the code by yourself, kindly visit my Github repository. Any comments are welcomed!

Let's start by taking a look at the original animation:

In [8]:

%%html
<style>.iframe-container {
  overflow: hidden;
  padding-top: 56.25%;
  position: relative;
}
 
.iframe-container iframe {
   border: 0;
   height: 100%;
   left: 0;
   position: absolute;
   top: 0;
   width: 100%;
}</style>
<div class="iframe-container">
<iframe src="https://www.youtube.com/embed/wXrYvd-LBu0" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
</div>

This visualization shows the deviations from the average temperature between 1850 and 2016. It was reshared millions of times over Twitter and Facebook. To understand the motivation behind this animation, check Ed Hawkins' website.

Exploring the Dataset¶

The underlying data was released by the Met Office in the United Kingdon, which does excellent work on weather and climate forecasting. The dataset can be downloaded directly here.
Let's first import the libraries needed in this project:

In [2]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

Let's download this dataset to our workspace:

In [3]:

from urllib.request import Request,urlopen
req=Request("https://www.metoffice.gov.uk/hadobs/hadcrut4/data/current/time_series/HadCRUT.4.6.0.0.monthly_ns_avg.txt",headers={'User-Agent': 'Mozilla/5.0'})
content=urlopen(req).read()
file=open("hadcrut.txt","wb")
file.write(content)
file.close()

Next, we need to read the dataset into a Pandas DataFrame:

In [4]:

hadcrut=pd.read_csv(file.name,delim_whitespace=True,usecols=[0,1],header=None)
hadcrut.head()

Out[4]:

	0	1
0	1850/01	-0.700
1	1850/02	-0.286
2	1850/03	-0.732
3	1850/04	-0.563
4	1850/05	-0.327

This dataset contains two columns:

The first column represents the month/year of recording
The second column represents the deviations from average temperature

Data Cleaning¶

Now, we need to:

split the first column into month and year columns
rename the column 1 to value
select and save all but the column 0

In [5]:

hadcrut["month"]=hadcrut[0].str.split("/").str[1].astype(int)
hadcrut["year"]=hadcrut[0].str.split("/").str[0].astype(int)
hadcrut.rename(columns={1:"value"},inplace=True)
hadcrut=hadcrut[["value","month","year"]].copy()

In [6]:

hadcrut.head()

Out[6]:

	value	month	year
0	-0.700	1	1850
1	-0.286	2	1850
2	-0.732	3	1850
3	-0.563	4	1850
4	-0.327	5	1850

In [7]:

hadcrut["year"].value_counts(ascending=True).head()

Out[7]:

2019     6
1958    12
1959    12
1960    12
1961    12
Name: year, dtype: int64

In order to keep our data consistent and tidy, we will remove the rows containing data from 2019, since it is the only year with 6 months, not 12 months:

In [8]:

hadcrut=hadcrut.drop(hadcrut[hadcrut["year"]==2019].index)
hadcrut["year"].value_counts(ascending=True).head()

Out[8]:

1850    12
1957    12
1958    12
1959    12
1960    12
Name: year, dtype: int64

Lastly, let’s compute the mean of the global temperatures from 1850 to 1900 and subtract that value from the entire dataset. To make this easier, we’ll create a multiindex using the year and month columns:

In [9]:

hadcrut=hadcrut.set_index(["year","month"])
hadcrut.head(20)

Out[9]:

		value
year	month
1850	1	-0.700
	2	-0.286
	3	-0.732
	4	-0.563
	5	-0.327
	6	-0.213
	7	-0.125
	8	-0.237
	9	-0.439
	10	-0.451
	11	-0.187
	12	-0.257
1851	1	-0.296
	2	-0.356
	3	-0.479
	4	-0.441
	5	-0.295
	6	-0.197
	7	-0.212
	8	-0.157

This way, we will be only modifying the "value" column:

In [10]:

hadcrut -= hadcrut.loc[1850:1900].mean()
hadcrut.head()

Out[10]:

		value
year	month
1850	1	-0.386559
	2	0.027441
	3	-0.418559
	4	-0.249559
	5	-0.013559

Let's reset the index to its default layout:

In [11]:

hadcrut=hadcrut.reset_index()
hadcrut.head()

Out[11]:

	year	month	value
0	1850	1	-0.386559
1	1850	2	0.027441
2	1850	3	-0.418559
3	1850	4	-0.249559
4	1850	5	-0.013559

Preparing data for polar plotting¶

The key steps to recreate the visualization:

transforming the data for polar visualization
customizing the aesthetics of the plot
stepping through the visualization year-by-year and turning the plot into a GIF

Let's start by plotting the data for the 1850 in polar coordinates:
It is important first to adjust the data to contain no negative values, let's find the minimum temperature value:

In [12]:

hadcrut["value"].min()

Out[12]:

-0.6605588235294118

Let’s add 1 to all temperature values, so they’ll be positive but there’s still some space reserved around the origin for displaying text:

In [13]:

hc_1850=hadcrut[hadcrut["year"]==1850]
r=hc_1850["value"]+1
theta=np.linspace(0,2*np.pi,12)

In [14]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")
ax1.plot(theta,r)
plt.show()

Tweaking the Aesthetics¶

Let's remove the tick labels for both axes:

In [15]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")
ax1.plot(theta,r)
ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.set_xticks([])
ax1.set_yticks([])
plt.show()

Next, let's tweak the color; we need the background color within the polar plot to be black, and the color surrounding the polar plot to be gray:

In [16]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")
ax1.plot(theta,r)
ax1.set_xticklabels([])
ax1.set_yticklabels([])
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_xticks([])
ax1.set_yticks([])
plt.show()

Next, let's add the title:

In [17]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")
ax1.plot(theta,r)
ax1.set_xticklabels([])
ax1.set_yticklabels([])
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_xticks([])
ax1.set_yticks([])
ax1.set_title("Global Temperature Change (1850-2018)",color="white",fontsize=25)
plt.show()

Lastly, let’s add the text in the center that specifies the current year that’s being visualized:

In [18]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")
ax1.plot(theta,r)
ax1.set_xticklabels([])
ax1.set_yticklabels([])
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_title("Global Temperature Change (1850-2018)",color="white",fontsize=25)
ax1.text(0,0,"1850",color="white",size=30,ha="center")
ax1.set_xticks([])
ax1.set_yticks([])
plt.show()

Plotting the remaining years¶

It is important here is to manually set the axis limit for r (or y in matplotlib). This is because matplotlib scales the size of the plot automatically based on the data that’s used. This is why, in the last step, the data for just 1850 was displayed at the edge of the plotting area.
To mimick the original animation, let’s calculate the maximum temperature value in the entire dataset and add a generous amount of padding:

In [19]:

hadcrut["value"].max()

Out[19]:

1.4244411764705882

In [20]:

ax1.set_ylim(0,3.25)

Out[20]:

(0, 3.25)

Next, let's loop over the rest of the data to generate the plots for the rest of the years:

In [21]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")

ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.set_ylim(0,3.25)
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_title("Global Temperature Change (1850-2018)",color="white",fontsize=25)
ax1.set_xticks([])
ax1.set_yticks([])

theta = np.linspace(0, 2*np.pi, 12)
years=hadcrut["year"].unique()

for year in years:
  r=hadcrut.loc[hadcrut["year"]==year,"value"]+1
  ax1.plot(theta,r)
plt.show()

Customizing the colors¶

Right now, the colors feel a bit random and don’t correspond to the gradual heating of the climate that the original visualization conveys well. In the original visualization, the colors transition from blue / purple, to green, to yellow. This color scheme is known as sequential colormap, because the progression of colors has a meaning in the data.
Essentially, we use the color parameter in the Axes.plot() method and draw colors from plt.cm.(index*2) to progress from blue to green and eventually reach yellow:

In [22]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")

ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.set_ylim(0,3.25)
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_title("Global Temperature Change (1850-2018)",color="white",fontsize=25)
ax1.set_xticks([])
ax1.set_yticks([])

theta = np.linspace(0, 2*np.pi, 12)
years=hadcrut["year"].unique()

for index,year in enumerate(years):
  r=hadcrut.loc[hadcrut["year"]==year,"value"]+1
  ax1.plot(theta,r,c=plt.cm.viridis(index*2))
plt.show()

Adding Temperature Rings¶

At this stage, the viewer can't actually understand the underlying data at all. There is no indication of temperture values in the visualization.
Next, we will add temperature rings at 0.0, 1.5, 2.0 degrees Celsius:

In [23]:

full_circle_thetas=np.linspace(0,2*np.pi,1000)
blue_one_radii=[0.0+1.0]*1000
red_one_radii=[1.5+1.0]*1000
red_two_radii=[2.0+1.0]*1000

In [24]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")

ax1.plot(full_circle_thetas, blue_one_radii, c='blue')
ax1.plot(full_circle_thetas, red_one_radii, c='red')
ax1.plot(full_circle_thetas, red_two_radii, c='red')
ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.set_ylim(0,3.25)
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_title("Global Temperature Change (1850-2018)",color="white",fontsize=25)
ax1.set_xticks([])
ax1.set_yticks([])

theta = np.linspace(0, 2*np.pi, 12)
years=hadcrut["year"].unique()

for index,year in enumerate(years):
  r=hadcrut.loc[hadcrut["year"]==year,"value"]+1
  ax1.plot(theta,r,c=plt.cm.viridis(index*2))
plt.show()

Next, we can add the text specifying the ring’s temperature values. All 3 of these text values are at the 0.5*pi angle, at varying distance values:

In [25]:

fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")

ax1.plot(full_circle_thetas, blue_one_radii, c='blue')
ax1.plot(full_circle_thetas, red_one_radii, c='red')
ax1.plot(full_circle_thetas, red_two_radii, c='red')
ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.set_ylim(0,3.25)
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_title("Global Temperature Change (1850-2018)",color="white",fontsize=25)
ax1.set_xticks([])
ax1.set_yticks([])
ax1.text(np.pi/2, 0.90, "0.0 C", color="blue", ha='center')
ax1.text(np.pi/2, 2.40, "1.5 C", color="red", ha='center', fontsize= 15,bbox=dict(facecolor='#000100', edgecolor='#000100'))
ax1.text(np.pi/2, 2.90, "2.0 C", color="red", ha='center', fontsize= 15,bbox=dict(facecolor='#000100', edgecolor='#000100'))

theta = np.linspace(0, 2*np.pi, 12)
years=hadcrut["year"].unique()

for index,year in enumerate(years):
  r=hadcrut.loc[hadcrut["year"]==year,"value"]+1
  ax1.plot(theta,r,c=plt.cm.viridis(index*2))

plt.show()

Lastly, let's add month values to the outer rim of the polar plot:

In [28]:

months=["Jan","Feb","Mar","Apr","May","Jun","Jul","Aug","Sep","Oct","Nov","Dec"]
fig=plt.figure(figsize=(8,8))
ax1=plt.subplot(111,projection="polar")

ax1.plot(full_circle_thetas, blue_one_radii, c='blue')
ax1.plot(full_circle_thetas, red_one_radii, c='red')
ax1.plot(full_circle_thetas, red_two_radii, c='red')
ax1.set_xticks([])
ax1.set_yticks([])
ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.set_ylim(0,3.25)
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_title("Global Temperature Change (1850-{})".format(hadcrut["year"].max()),color="white",fontsize=25,ha="center")
ax1.text(np.pi/2, 0.90, "0.0 C", color="blue", ha='center',fontsize= 15)
ax1.text(np.pi/2, 2.40, "1.5 C", color="red", ha='center', fontsize= 15,bbox=dict(facecolor='#000100', edgecolor='#000100'))
ax1.text(np.pi/2, 2.90, "2.0 C", color="red", ha='center', fontsize= 15,bbox=dict(facecolor='#000100', edgecolor='#000100'))

theta = np.linspace(0, 2*np.pi, 12)
years=hadcrut["year"].unique()

fig.text(0.78,0,"HadCRUT 4.6",color="white",fontsize=15)
fig.text(0.05,0.02,"Anis Ismail",color="white",fontsize=15)
fig.text(0.05,0,"Based on Ed Hawkins's 2017 Visualization",color="white",fontsize=10)

#add months ring
months_angles= np.linspace((np.pi/2)+(2*np.pi),np.pi/2,13)
for i,month in enumerate(months):
  ax1.text(months_angles[i],3.4,month,color="white",fontsize=15,ha="center")

for index,year in enumerate(years):
  r=hadcrut.loc[hadcrut["year"]==year,"value"]+1
  ax1.plot(theta,r,c=plt.cm.viridis(index*2))

plt.tight_layout()
plt.show()

Generating The GIF Animation¶

Now we’re ready to generate a GIF animation from the plot. An animation is a series of images that are displayed in rapid succession. We’ll use the matplotlib.animation.FuncAnimation function to help us with this:

In [27]:

from matplotlib.animation import FuncAnimation

fig = plt.figure(figsize=(8,8))
ax1 = plt.subplot(111, projection='polar')
ax1.plot(full_circle_thetas, blue_one_radii, c='blue')
ax1.plot(full_circle_thetas, red_one_radii, c='red')
ax1.plot(full_circle_thetas, red_two_radii, c='red')
ax1.set_xticklabels([])
ax1.set_yticklabels([])
ax1.set_ylim(0,3.25)
fig.set_facecolor("#323331")
ax1.set_facecolor("#000100")
ax1.set_title("Global Temperature Change (1850-2018)",color="white",fontsize=20)
ax1.set_xticks([])
ax1.set_yticks([])
#ax1.text(np.pi/2, 0.90, "0.0 C", color="blue", ha='center',fontsize= 15)
ax1.text(np.pi/2, 2.40, "1.5 C", color="red", ha='center', fontsize= 15,bbox=dict(facecolor='#000100', edgecolor='#000100'))
ax1.text(np.pi/2, 2.90, "2.0 C", color="red", ha='center', fontsize= 15,bbox=dict(facecolor='#000100', edgecolor='#000100'))

fig.text(0.78,0.01,"HadCRUT 4.6",color="white",fontsize=15)
fig.text(0.05,0.03,"Anis Ismail",color="white",fontsize=15)
fig.text(0.05,0.01,"Based on Ed Hawkins's 2017 Visualization",color="white",fontsize=10)

months_angles= np.linspace((np.pi/2)+(2*np.pi),np.pi/2,13)
for i,month in enumerate(months):
  ax1.text(months_angles[i],3.4,month,color="white",fontsize=15,ha="center")

def update(i):
    # Remove the last year text at the center
    for txt in ax1.texts:
      if(txt.get_position()==(0,0)):
        txt.set_visible(False)
    # Specify how we want the plot to change in each frame.
    # We need to unravel the for loop we had earlier.
    year = years[i]
    r = hadcrut[hadcrut['year'] == year]['value'] + 1
    ax1.plot(theta, r, c=plt.cm.viridis(i*2))
    ax1.text(0,0,year,fontsize=20,color="white",ha="center")
    return ax1

anim = FuncAnimation(fig, update, frames=len(years), interval=100)

#instead of installing ffmpeg we will use matplotlib.animation.HTMLWriter
#from matplotlib.animation import FFMpegWriter
#plt.rcParams['animation.ffmpeg_path'] = 'C:\\Program Files\\ffmpeg\\bin\\ffmpeg.exe'
#mywriter = FFMpegWriter()

anim.save('climate_spiral.html',savefig_kwargs={'facecolor': '#323331'})

MovieWriter ffmpeg unavailable; trying to use <class 'matplotlib.animation.HTMLWriter'> instead.