Geophysical Challenge 1 - Need some help?

It has been a few weeks since we posted our first geophysical challenge, so we have put together a tutorial post to help you along. First things first: download the data for this challenge and save it in the same folder as your Python challenge 1 script. Now open Spyder. The step-by-step tutorial below explains each step as we go.

Step 1: Importing numpy.

[Image: Starting With Spyder]

"""
Archaeopy Geophysical Challenge 1

A Program to Load XYZ Data, Grid the Data and Display in Matplotlib
"""

# We import numpy as np to make our code more readable
# and to stay consistent with other Python users
import numpy as np
# We can then call numpy functions by typing np.<function name>

Step 2: Loading data:

#To load the data use numpy.loadtxt

data = np.loadtxt('xyz.csv', dtype=float, delimiter=',', skiprows=1)

# This call passes loadtxt the parameters needed to import the data, then loads it.
# Consult the loadtxt documentation for all the optional parameters you can use.
# The parameters we are concerned with are the file name 'xyz.csv'; the data type
# (float), although this is optional since the default type is float;
# the delimiter ',' (a comma, for a csv); and skiprows=1, because the first row
# holds the header x,y,z rather than float values, and every element of a
# numpy array must have the same datatype.

Numpy handles data in arrays. If you are unfamiliar with arrays, please have a read through this numpy readme file. In the meantime, here is a brief overview of arrays within numpy. Since this challenge uses an xyz csv file, we will compare our numpy array to an Excel spreadsheet. For instance, the image below shows the csv of xyz values we are loading as part of this challenge (rows 4-11,833 are hidden):

[Image: the xyz csv file open in a spreadsheet]

Each row represents a measurement (z) at position (x, y). The A, B, and C columns hold the x, y, and z values, respectively. Remember that when we loaded the data using loadtxt we told it to skip row 1? Numpy requires every element of the array to be the same type, and since we are working with floats, the text header in row 1 has to be skipped. Here is the numpy array created when we load the csv file:

[Image: Numpy Array]

Looks pretty similar to the spreadsheet!
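To see the same thing without opening the real file, here is a minimal sketch that runs the loadtxt call from Step 2 on a three-row stand-in for xyz.csv held in memory. The values and the csv_text name are made up for illustration; only the loadtxt parameters come from the tutorial.

```python
import io
import numpy as np

# A tiny in-memory stand-in for xyz.csv (illustrative values, not the challenge data)
csv_text = "x,y,z\n0.0,0.0,1.5\n1.0,0.0,2.5\n0.0,0.25,-0.5\n"

# Same parameters as Step 2; loadtxt accepts a file-like object as well as a file name
data = np.loadtxt(io.StringIO(csv_text), dtype=float, delimiter=',', skiprows=1)

print(data.shape)  # (number of rows, number of columns)
print(data.dtype)  # every element shares this one type
print(data)
```

The header row is gone, and what is left is one row per measurement with x, y and z in columns 0, 1 and 2.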

Step 3: Gridding data.

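# To interpolate the data we need to define the points to interpolate to.
# For this we'll use numpy again: linspace(start, stop, num) returns num
# evenly spaced values from start to stop, both ends included.
# In Spyder you'll be able to see the linspace options in the top right.
xi = np.linspace(0, 46, 47)   # 47 x positions from 0 to 46
yi = np.linspace(0, 60, 244)  # 244 y positions from 0 to 60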

# We can automate the ranges for linspace using numpy min & max functions.
# This is a good extension to this challenge - bonus points

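# griddata comes from matplotlib.mlab; its import (shown in Step 4) must
# appear before this line in your script
zi = griddata(data[:,0], data[:,1], data[:,2], xi, yi)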

Griddata requires the following parameters:

griddata(x, y, z, xi, yi, interp='nn')

We don't have x, y and z as separate variables; all three live in a single numpy array. Instead of defining them explicitly, we can use numpy array indexing to extract the required columns.

In the spreadsheet example above,

data[:,0]

refers to all the rows (:) in column A (0), which in this case are the X values.
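The same indexing works for the other two columns. A minimal sketch on a small made-up array (the values and the x, y, z and first_row names are illustrative only):

```python
import numpy as np

# Three measurements: columns are x, y, z (illustrative values)
data = np.array([[0.0, 0.0, 1.5],
                 [1.0, 0.0, 2.5],
                 [0.0, 0.25, -0.5]])

x = data[:, 0]  # all rows, column 0 (spreadsheet column A)
y = data[:, 1]  # all rows, column 1 (spreadsheet column B)
z = data[:, 2]  # all rows, column 2 (spreadsheet column C)

first_row = data[0, :]  # row 0, all columns: one complete measurement

print(x, y, z)
print(first_row)
```

Note that numpy counts from 0, so spreadsheet column A is column 0 and spreadsheet row 2 (the first row of numbers) is row 0.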

Step 4: Displaying the Data

#We can also import individual functions in the following way
from matplotlib.mlab import griddata

# This keeps our namespace tidy and saves typing, because we only bring in
# the names we need (Python still loads the whole module behind the scenes)
# We can run this function by calling griddata directly

# Finally, we can combine these two techniques: import just the pyplot module
# from matplotlib and give it a shortened name for readability
import matplotlib.pyplot as plt

im = plt.imshow(zi, cmap=plt.cm.Greys)
plt.colorbar()
plt.show()

This should produce an image like the one below:

[Image: im.show]

With a few extra lines of code you can make the image more usable. We won't give instructions on this; think of it as extra credit.

[Image: Matplotlib XYZ Data]
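A note for readers on a current install: matplotlib.mlab.griddata was removed in Matplotlib 3.1, so the griddata import in Step 4 fails there. One possible substitute is scipy.interpolate.griddata. This is a swapped-in function, not the one the challenge was written around: it takes its arguments in a different form and offers 'nearest', 'linear' and 'cubic' interpolation rather than natural neighbour, so the gridded values will differ slightly. A minimal sketch, assuming SciPy is installed and using a small synthetic array (made up here) in place of xyz.csv:

```python
import numpy as np
from scipy.interpolate import griddata  # SciPy's griddata, not matplotlib.mlab's

# Synthetic stand-in for the array loaded from xyz.csv (illustrative values only):
# a 5 x 4 survey where z = x + 2*y
xs, ys = np.meshgrid(np.arange(5.0), np.arange(4.0))
data = np.column_stack([xs.ravel(), ys.ravel(), (xs + 2 * ys).ravel()])

# Points to interpolate to, as in Step 3 but sized for the toy data
xi = np.linspace(0, 4, 9)
yi = np.linspace(0, 3, 7)

# SciPy wants the target points as full 2D coordinate arrays
grid_x, grid_y = np.meshgrid(xi, yi)

# points = the (x, y) columns, values = the z column
zi = griddata(data[:, :2], data[:, 2], (grid_x, grid_y), method='linear')

print(zi.shape)  # one row per yi value, one column per xi value
```

The resulting zi is a 2D array laid out the same way as in Step 3, so it can be passed to plt.imshow as in Step 4.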