Problem Set 2.1 - Arctic ice maps

Total points: 30

Due: Friday October 14 7pm CEST

Format: IPython Notebook or python program

Author: Thomas Robitaille

Solved by: Marta Reina-Campos

The purpose of this problem set is to become familiar with working with image data, plotting it, and combining it in various ways for analysis.

The data used in this problem set was collected by the AMSR-E instrument Aqua satellite. The data consists of maps of the concentration of ice in the Arctic collected between 2006 and 2011. The data were downloaded and extracted from here and converted into a format that can be more easily used.

The data you should use is in here (located at MPIA, use this link if you need a zip archive) and can also be accessed in /local/py4sci/ice_data if you are using the CIP computers (no need to copy over that directory, just read the files from there). To use it, copy the tgz file into your day3 directory and type:

 tar -xvzf ice.tgz

The data is in 'Numpy' format, which means that you can read it as a Numpy array using:

>>> import numpy as np
>>> data = np.load('ice_data/20080415.npy')

which will give you a 2-d array. Just for information, this was created with:

>>> np.save('ice_data/20080415.npy', data)
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt

Part 1 - examining a single map (6 points)

Start off by reading in one of the maps as shown above, and plot it with Matplotlib. Note that to get the correct orientation, you will need to call the imshow command with the origin='lower' option, which ensures that the (0,0) pixels is on the bottom left, not the top left. You can try and use different colormaps if you like (set by the cmap option) - see here for information on the available colormaps. You can specify a colormap to use with e.g. cmap=plt.cm.jet (i.e. cmap=plt.cm. followed by the name of the colormap). Note that you can make figures larger by specifying e.g.

>>> plt.figure(figsize=(8,8))

where the size is given in inches. Try and find a way to plot a colorbar on the side, to show what color corresponds to what value. Examples can be found in the Matplotlib Gallery. You may also want to try to remove the tick labels (100, 200, etc.) since they are not useful.

In [2]:
import numpy

path_file = 'PS02-ice_data/ice_data/20080415.npy'                           # path of the input file

data = numpy.load(path_file)                                                # open the input file
plt.figure(figsize = (8,8))                                                 # set the figure size
plt.imshow(data, cmap = plt.cm.GnBu_r, origin = "lower")                    # plot the data in the input file
ticks_colorbar = numpy.linspace(numpy.nanmin(data), numpy.nanmax(data), 11) # determine what ticks will appear 
                                                                            # in the colorbar
plt.colorbar(ticks = ticks_colorbar)                                        # add colorbar to the figure    
plt.axis('off')                                                             # take out axis
plt.title("Map of the concentration of ice in the Artic - 2008/04/15")      # title of the figure
plt.show()                                                                  # show the figure

Part 2 - reading in multiple maps (10 points)

We now want to make a plot of the ice concentration over time. Reading in a single map is easy, but since we have 137 maps, we do not want to read them all in individually by hand. Write a loop over all the available files, and inside the loop, read in the data to a variable (e.g. data), and also extract the year, month, and day as integer values (e.g. year, month, and day). Then, also inside the loop, compute a variable time which is essentially the fractional time in years (so 1st July 2012 is 2012.5). You can assume for simplicity that each month has 30 days - this will not affect the results later. Finally, also compute for each file the total number of pixels that have a value above 50%. After the loop, make a plot of the number of pixels with a concentration above 50% against time.

You will likely notice that the ticks are in a strange format, where they are given in years since 2006, but you can change this with the following code:

>>> from matplotlib.ticker import ScalarFormatter
>>> plt.gca().xaxis.set_major_formatter(ScalarFormatter(useOffset=False))

Describe what you see in the plot.

We now want something a little more quantitative than just the number of pixels, so we will try and compute the area where the ice concentration is above a given threshold. However, we first need to know the area of the pixels in the image, and since we are looking at a projection of a spherical surface, each pixel will be a different area. The areas (in km^2) are contained inside the file named ice_data_area.npy (if you are using the CIP pool, this is /local/py4sci/ice_data_area.npy). Read in the areas and make a plot (with colorbar) to see how the pixel area is changing over the image.

Now, loop over the files again as before, but this time, for each file, compute the total area where the concentration of ice is 99% or above. Make a new plot showing the area of >99% ice concentration against time.

Describe what you see - how does the minimum change over time?

In [3]:
### PART 1 - Plot the variation of the number of pixels over 50% with time

import glob                                                                 # import modules
from matplotlib.ticker import ScalarFormatter
import os

path = 'PS02-ice_data/ice_data'                                             # path of the input files
list_filenames = glob.glob(os.path.join(path,'*.npy'))                      # list of ice maps

date = []
time = numpy.ndarray(shape=len(list_filenames))                             # we define empty arrays of the 
pixels_over_half = numpy.ndarray(shape=len(list_filenames))                 # appropiate length

for filename in list_filenames:                                             # per each ice map
    index = list_filenames.index(filename)                                  # determine its index
    date.append(filename[len(path)+1:-4])                                   # get its date from the name
    year = int(date[index][:4]);                                            # determine the year, month and day
    month = int(date[index][4:6]);
    day = int(date[index][6:]) 
    time[index] = year + (month-1)/12.0 + (day-1)/(30.0*12)                 # compute its fractional time
    
    data = numpy.load(filename)                                             # load the ice map data
    data[numpy.isnan(data)] = 0                                             # set all NaNs to zero
    pixels_over_half[index] = len(data[data > 50])                          # count the number of pixels over 50 %
    
plt.figure(figsize=(8,8))                                                   # set the size of the figure
plt.plot(time, pixels_over_half/10**4)                                      # plot the number of pixels vs time
plt.ylabel(r"Pixels over 50% [$\times10^4$]")                               # set the y-label
plt.xlabel("Time")                                                          # set the x-label
plt.xticks(time[time%0.25 == 0], rotation = 'vertical')                     # use as xticks times multiples of 0.25
plt.gca().xaxis.set_major_formatter(ScalarFormatter(useOffset=False))       # show years without bias
plt.show()                                                                  # show the figure
In [4]:
### PART 1&2 - Plot the area per pixel and the variation of the number of pixels over 50% and the area covered with time

import glob                                                                 # import modules
from matplotlib.ticker import ScalarFormatter
import os

path = 'PS02-ice_data/ice_data'                                             # path of the input files
list_filenames = glob.glob(os.path.join(path,'*.npy'))                      # list of ice maps

path_areas = 'PS02-ice_data/'
data_areas = numpy.load(path_areas+'ice_data_area.npy')  

plt.figure(figsize = (8,8))                                                 # set the figure size
plt.imshow(data_areas, cmap = plt.cm.GnBu_r, origin = "lower")              # plot the data in the input file
ticks_colorbar = numpy.linspace(numpy.nanmin(data_areas),
                                numpy.nanmax(data_areas), 11)               # determine what ticks will appear 
                                                                            # in the colorbar
plt.colorbar(ticks = ticks_colorbar)                                        # add colorbar to the figure    
plt.axis('off')                                                             # take out axis
plt.title(r"Area per pixel [km$^2$]")                                       # title of the figure
plt.show()                                                                  # show the figure

date = []
time = numpy.ndarray(shape=len(list_filenames))                             # we define empty arrays of the 
pixels_over_half = numpy.ndarray(shape=len(list_filenames))                 # appropiate length
total_area_over_99 = numpy.ndarray(shape=len(list_filenames)) 

for filename in list_filenames:                                             # per each ice map
    index = list_filenames.index(filename)                                  # determine its index
    date.append(filename[len(path)+1:-4])                                   # get its date from the name
    year = int(date[index][:4]);                                            # determine the year, month and day
    month = int(date[index][4:6]);
    day = int(date[index][6:])       
    time[index] = year + (month-1)/12.0 + (day-1)/(30.0*12)                 # compute its fractional time
    
    data = numpy.load(filename)                                             # load the ice map data
    data[numpy.isnan(data)] = 0                                             # set all NaNs to zero
    pixels_over_half[index] = len(data[data > 50])                          # count the number of pixels over 50 %
    
    data[data < 99] = 0                                                     # set all data less than 99% to zero
    total_area_over_99[index] = sum(sum(data_areas*data))                   # total area = data[!=0]*area
    
plt.figure(figsize=(8,8))                                                   # set the size of the figure
plt.plot(time, pixels_over_half/10**4)                                      # plot the number of pixels vs time
plt.ylabel(r"Pixels over 50% [$\times10^4$]")                               # set the y-label
plt.xlabel("Time")                                                          # set the x-label
plt.xticks(time[time%0.25 == 0], rotation = 'vertical')                     # use as xticks times multiples of 0.25
plt.gca().xaxis.set_major_formatter(ScalarFormatter(useOffset=False))       # show years without bias

plt.figure(figsize=(8,8))                                                   # set the size of the figure
plt.plot(time, total_area_over_99/10**9)                                    # plot the number of pixels vs time
plt.ylabel(r"Total area over 99% [$\times10^9\,$km$^2$]")                   # set the y-label
plt.xlabel("Time")                                                          # set the x-label
plt.xticks(time[time%0.25 == 0], rotation = 'vertical')                     # use as xticks times multiples of 0.25
plt.gca().xaxis.set_major_formatter(ScalarFormatter(useOffset=False))       # show years without bias

plt.show()                                                                  # show the figure

Part 3 - visualizing changes over time (10 points)

Find the date at which the area of the region where the ice concentration is above 99% is the smallest. What is the value of the minimum area?

Next, read in the map for this minimum, and the map for the same day and month but from 2006. Make a side-by-side plot showing the 2006 and the 2011 data.

Compute the difference between the two maps so that a loss in ice over time will correspond to a negative value, and a gain in ice will correspond to a positive value. Make a plot of the difference, and use the RdBu colormap to highlight the changes (include a colorbar).

In [5]:
# Part 1 - Find the minimum area where the ice concentratio is above 99 % and its corresponding date
year_month_day = date[numpy.argmin(total_area_over_99)]
minimum_area = min(total_area_over_99)

print("Date and area of the minimum ice concentration above 99%: ", year_month_day, minimum_area, " km^2")

# Part 2 - Open the file corresponding to the minimum and its equivalent day on 2006 and plot them side to side
# set the filenames for both files
filename_minimum_ice_coverage = path + "/" + year_month_day + ".npy"
filename_2006_ice_coverage = path + "/2006" + year_month_day[4:] + ".npy"

# open and load both input files
data_minimum_ice_coverage = numpy.load(filename_minimum_ice_coverage)
data_2006_ice_coverage = numpy.load(filename_2006_ice_coverage)

fig = plt.figure(figsize=(10,5))                                           # create a figure of size (10,5) inches
ax1 = fig.add_subplot(121)                                                 # add the first subplot
im1 = ax1.imshow(data_2006_ice_coverage, cmap = plt.cm.GnBu_r,\
                 origin = "lower")                                         # plot the data
ax1.set_title("2006" + year_month_day[4:])                                 # add a title with the date
ax1.axis('off')                                                            # take out axis
ticks_colorbar = numpy.linspace(numpy.nanmin(data_2006_ice_coverage),\
                                numpy.nanmax(data_2006_ice_coverage), 11)  # determine what ticks will appear 
                                                                           # in the colorbar
plt.colorbar(im1, ticks = ticks_colorbar)                                  # add colorbar to the first subplot    

ax2 = fig.add_subplot(122)                                                 # add second subplot
im2 = ax2.imshow(data_minimum_ice_coverage, cmap = plt.cm.GnBu_r,\
                 origin = "lower")                                         # plot the data
ax2.set_title(year_month_day)                                              # add a title with the date
ax2.axis('off')                                                            # take out axis
ticks_colorbar = numpy.linspace(numpy.nanmin(data_minimum_ice_coverage),\
                                numpy.nanmax(data_minimum_ice_coverage), 11)  # determine what ticks will appear 
                                                                           # in the colorbar
plt.colorbar(im2, ticks = ticks_colorbar)                                  # add colorbar to the second subplot    
plt.show()                                                                 # show the figure

# Part 3: determine the difference between the two maps
data_difference_maps = data_minimum_ice_coverage - data_2006_ice_coverage
plt.figure(figsize=(7,7))                                                  # create a figure of size (7,7) inches
plt.imshow(data_difference_maps, cmap = plt.cm.RdBu, origin = "lower")     # plot the data
plt.title("Difference between maps (Negative is loss)")                    # add a title
plt.axis('off')                                                            # take out axis
ticks_colorbar = numpy.linspace(numpy.nanmin(data_difference_maps),\
                                numpy.nanmax(data_difference_maps), 11)    # determine what ticks will appear 
                                                                           # in the colorbar
plt.colorbar(ticks = ticks_colorbar)                                       # add colorbar to the first subplot    
plt.show()
Date and area of the minimum ice concentration above 99%:  20110815 72749991.2152  km^2

Part 4 - yearly averages (4 points)

Compute average ice concentration maps for 2006 and 2011, and plot them side by side.

In [6]:
# small function to quickly determine the average maps given a list of filenames
def func_average_maps(list_filenames):
    sum_data = 0                                                           # empty array
    for file in list_filenames:                                            # per each map in the list
        data = numpy.load(file)                                            # load the data
        data[numpy.isnan(data)] = 0                                        # remove the NaNs values
        sum_data += data                                                   # add them all together
    average_map = sum_data/(len(list_filenames))                           # average over the number of maps
    return average_map                                                     # return the average map

list_filenames_2011 = glob.glob(os.path.join(path,'2011*.npy'))            # list of ice maps
list_filenames_2006 = glob.glob(os.path.join(path,'2006*.npy'))            # list of ice maps

average_data_2006 = func_average_maps(list_filenames_2006)                 # determine average maps for 2006
average_data_2011 = func_average_maps(list_filenames_2011)                 # determine average maps for 2011

# plot the averages side by side
fig = plt.figure(figsize=(10,5))                                           # create a figure of size (10,5) inches
ax1 = fig.add_subplot(121)                                                 # add the first subplot
im1 = ax1.imshow(average_data_2006, cmap = plt.cm.GnBu,\
                 origin = "lower")                                         # plot the data
ax1.set_title("Average for 2006")                                          # add a title
ax1.axis('off')                                                            # take out axis
ticks_colorbar = numpy.linspace(numpy.nanmin(average_data_2006),\
                                numpy.nanmax(average_data_2006), 11)       # determine what ticks will appear 
                                                                           # in the colorbar
plt.colorbar(im1, ticks = ticks_colorbar)                                  # add colorbar to the first subplot    

ax2 = fig.add_subplot(122)                                                 # add second subplot
im2 = ax2.imshow(average_data_2011, cmap = plt.cm.GnBu,\
                 origin = "lower")                                         # plot the data
ax2.set_title("Average for 2011")                                          # add a title
ax2.axis('off')                                                            # take out axis
ticks_colorbar = numpy.linspace(numpy.nanmin(average_data_2011),\
                                numpy.nanmax(average_data_2011), 11)       # determine what ticks will appear 
                                                                           # in the colorbar
plt.colorbar(im2, ticks = ticks_colorbar)                                  # add colorbar to the second subplot    
plt.show()                                                                 # show the figure

Epilogue

The data that we have here only cover five years, so we cannot reliably extract information about long term trends. However, it is worth noting that the minimum ice coverage you found here was a record minimum - never before (in recorded history) had the size of the ice shelf been so small. This is part of a long term trend due to global warming. In 2012, the record was again beaten, and most scientists believe that by ~2050, the Arctic will be completely ice-free for at least part of the summer.