This presentation was given at the "Recent Work in Archaeological Geophysics" conference in December 2014.
It has been a few weeks since we posted our first geophysical challenge, so we thought we would write a tutorial post to help you along. First things first: download the data for this challenge and save it to the same folder as your Python challenge 1 script. Now open Spyder. The step-by-step tutorial below explains what each step does.
Following on from yesterday's Geolunch, here's the first Applied Geophysical ArchaeoPy challenge:
Load an XYZ .csv file and plot the data within Python
You can do this in any way you want but some hints and tips are given below:
- Download the data from http://www.archaeopy.org/wp-content/uploads/2014/04/xyz.csv
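- numpy.loadtxt to load in a CSV file
- matplotlib.mlab.griddata to grid the XYZ data (this function was removed in Matplotlib 3.1; on newer installs scipy.interpolate.griddata does a similar job)
- matplotlib.pyplot to plot the data
- If you get stuck, Google is your friend!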
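The hints above can be sketched roughly as follows. This is only one possible approach, not the official solution. It assumes the file holds three comma-separated columns (x, y, z) with no header line (add skiprows=1 to loadtxt if yours has one), and it swaps in scipy.interpolate.griddata for the gridding step. The function names load_xyz, grid_xyz and plot_grid are made up for this example, and the five data points are invented stand-ins for xyz.csv.

```python
import io

import numpy as np
from scipy.interpolate import griddata


def load_xyz(source):
    """Load three comma-separated columns (x, y, z) from a path or file object."""
    data = np.loadtxt(source, delimiter=",")
    return data[:, 0], data[:, 1], data[:, 2]


def grid_xyz(x, y, z, nx=50, ny=50):
    """Interpolate scattered points onto a regular grid with ny rows and nx columns."""
    xi = np.linspace(x.min(), x.max(), nx)
    yi = np.linspace(y.min(), y.max(), ny)
    grid_x, grid_y = np.meshgrid(xi, yi)
    # Linear interpolation; cells outside the surveyed area come back as NaN
    grid_z = griddata((x, y), z, (grid_x, grid_y), method="linear")
    return xi, yi, grid_z


def plot_grid(xi, yi, grid_z):
    """Show the gridded data as a greyscale image (opens a window when called)."""
    import matplotlib.pyplot as plt

    plt.imshow(grid_z, origin="lower", cmap="gray",
               extent=(xi[0], xi[-1], yi[0], yi[-1]))
    plt.colorbar(label="z")
    plt.show()


# Invented sample standing in for xyz.csv (assumption: same three-column layout)
sample = io.StringIO("0,0,1\n1,0,2\n0,1,3\n1,1,4\n0.5,0.5,2.5\n")
x, y, z = load_xyz(sample)
xi, yi, grid_z = grid_xyz(x, y, z, nx=5, ny=5)
print(grid_z.shape)
```

To try it on the real data, pass the path of the downloaded file to load_xyz instead of the sample, then call plot_grid(xi, yi, grid_z).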
Following on from my guide to Python for Archaeologists, this is an introduction to dealing with data in Python.
@GirlWithTrowel posted some documents about her geophysics data processing steps here, including some possible programming steps to make life easier.
Initially we'll do much of this through command-line programs rather than creating a pretty user interface.
Reading and Printing
Last time we printed 'Hello World'. This time we'll do something a little more useful.
The code and data for this are available here
"""
Program to open a file and print the contents line by line
"""
# Sets the path and filename that you want to open.
filename = 'data/MultipleLineText.txt'
# Opens the file defined by filename; 'r' means the file is opened for reading
f = open(filename, 'r')
# *Loop* For each line in the file 'f', do whatever is inside the loop
for line in f:
    # The indentation means we're inside the loop
    # Prints the line
    print(line)
# Closes the file 'f'
f.close()
I'm hoping the comments are detailed enough to explain what this code does.
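As a variation on the same idea, here is a small sketch that uses a "with" block, so Python closes the file for you even if something goes wrong part-way through. The helper name read_lines is made up for this example, and the script writes its own throwaway file so it runs without the downloaded data.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of a text file with the trailing newlines removed."""
    # The file is closed automatically when the 'with' block ends
    with open(path, "r") as f:
        return [line.rstrip("\n") for line in f]


# Write a small throwaway file so the example runs anywhere
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "MultipleLineText.txt")
with open(path, "w") as f:
    f.write("first line\nsecond line\nthird line\n")

lines = read_lines(path)
for number, line in enumerate(lines, start=1):
    # Prints the line number followed by the text of that line
    print(number, line)
```

Stripping the newline avoids the blank line that appears between printed lines when each line still carries its own line ending.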
Reading and Doing Something
Suppose we have some X, Y, C1, C2 data in a comma-delimited text file with a header line and we want to calculate the mean, min, max and standard deviation of C1 and C2.
This gives us a chance to write our own function and to use the NumPy library. The code and data are available here
"""
Program to open a CSV file, calculate some statistics,
deduct the mean and save the output
"""
# Imports the NumPy library, giving access to lots of fast numerical routines
import numpy as np


# Defines a function that you can call from within your program
def stats(data):
    # Uses NumPy to calculate statistics of data
    mn = np.min(data)
    mx = np.max(data)
    mean = np.mean(data)
    sd = np.std(data)
    return (mn, mx, mean, sd)


# Sets the path and filename that you want to open.
filename = 'data/randcsv.txt'

# Determines the header information by reading the first line of the file
with open(filename, 'r') as f:
    header = f.readline().strip()

# Uses NumPy to load the entire file and split the data into an array
f = np.loadtxt(filename, skiprows=1, delimiter=',')

# Extracts the 3rd column from array f (column 1 is referred to as 0 in Python)
c1 = f[:, 2]
# Extracts the 4th column from array f
c2 = f[:, 3]

# Passes data to our stats function and receives min, max, mean and standard deviation
c1_stats = stats(c1)
c2_stats = stats(c2)
print('statistics for C1 are {}'.format(c1_stats))
print('statistics for C2 are {}'.format(c2_stats))

# Deducts the mean (item 2 of the returned tuple) from the columns of data
c1 = c1 - c1_stats[2]
c2 = c2 - c2_stats[2]

# Returns the zero-mean values into the array
f[:, 2] = c1
f[:, 3] = c2

# Uses NumPy to save the updated array with the same header as a CSV .txt file
# (comments='' stops NumPy putting '# ' in front of the header)
np.savetxt('data/randvscout.txt', f, delimiter=',', header=header,
           comments='', fmt='%.2f')
As above, I'm hoping these examples are commented well enough for you to understand what's going on. I encourage you to modify this code and make it do more exciting things more quickly.
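One possible modification, offered as a sketch rather than the "right" answer: NumPy can work on several columns at once through the axis argument, so the zero-mean step needs no per-column variables. The data here is invented and held in memory (an io.StringIO standing in for the text file) so the example runs without the download. Note that np.std and the std method compute the population standard deviation by default.

```python
import io

import numpy as np

# Invented stand-in for a file with a header line and X, Y, C1, C2 columns
sample = io.StringIO("X,Y,C1,C2\n0,0,1.0,10.0\n1,0,2.0,20.0\n2,0,3.0,60.0\n")

# Reads the header line first, then lets NumPy load the remaining rows
header = sample.readline().strip()
data = np.loadtxt(sample, delimiter=",")

# Columns 2 onwards are the measurements (C1 and C2)
values = data[:, 2:]

# axis=0 means "work down each column", giving one result per column
col_mean = values.mean(axis=0)
col_std = values.std(axis=0)

# Deducts each column's mean from that column in one step
data[:, 2:] = values - col_mean

# Saves to an in-memory buffer here; pass a filename instead to write a file
out = io.StringIO()
np.savetxt(out, data, delimiter=",", header=header, comments="", fmt="%.2f")
print(out.getvalue())
```

The same lines work unchanged if more measurement columns are added, which is handy for multi-sensor instruments.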
Now that I've gone through enough examples to introduce the basics, I'm going to start making some modules and code that people can use to properly process geophysical datasets.
The 10th International Conference on Archaeological Prospection in Vienna involved a small but significant discussion about open software. This has continued on Twitter and I've committed myself to helping people start writing open-source software in Python.
I've chosen the Python programming language partly because it's one I have a working knowledge of, but more importantly because it's inherently readable. We have to remember that we're primarily scientists, archaeologists and geophysicists, not programmers. By using Python we can concentrate on what we want to happen to our data rather than on how to tell the computer to do it.
To get started with Python you'll need an implementation of Python on your computer. Python is cross-platform (it will work on Windows, Linux, Mac ...) but the method of installing it can be very different on each platform.
I've traditionally used PythonXY and Spyder on Windows and Mac systems. For the purposes of this introduction I'm going to recommend Anaconda because it's entirely cross-platform, so we should all encounter the same issues at the same time.
The installation method differs slightly between operating systems, but I think the Anaconda support documents are detailed enough.
I only came across Anaconda while writing this post. It was very easy to install on my Mac and, unlike PythonXY, it includes 64-bit support. I'm going to try using both Anaconda and PythonXY to see if they both work seamlessly.
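Once something is installed, a quick way to check it is to ask Python which libraries it can find. The sketch below is just one way of doing that. The helper name check_modules is made up, and the three libraries listed are the ones I'm assuming you'll want for geophysical work; change the list to suit.

```python
import importlib
import sys


def check_modules(names):
    """Return {name: version string}, with None for anything that won't import."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module records a version, so fall back to a label
            found[name] = getattr(module, "__version__", "unknown version")
    return found


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for name, version in check_modules(["numpy", "scipy", "matplotlib"]).items():
        print(name, version if version else "NOT INSTALLED")
```

Anything reported as NOT INSTALLED can usually be added through your distribution's package manager.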
Keeping Track of Code Changes
One of the hardest and most frustrating things when I started writing code to process my geophysical data was that I changed it so much from day to day that it became impossible to know how I'd created a particular dataset. I should have used a revision control system from the start but didn't, so I'm going to recommend that you do.
Getting started with Spyder
To start writing some code open up Spyder:
I like to think Spyder is relatively simple, but to the non-programmer it probably makes about as much sense as digging does to me. Hence the image below...
This looks the same on whatever operating system you're running, and you'll probably find I flick between Mac, Windows and IPython.
Writing Some Code
Well, we've done the boring stuff and got everything we need installed. Now to write some code.
Create a new file in Spyder and you should see something similar to this:
# -*- coding: utf-8 -*-
"""
Created on Sun Jun 9 15:53:31 2013

@author: popefinn
"""
The """ and # denote that the text is a comment and should not be 'run' as code. On a new line underneath the metadata comment section, type:
print ("Hello World")
Run the code by pressing F5, save the file as something sensible within your EasyMercurial repository, and accept the default runtime options. You should see the following:
You've just written your first Python program. It's not particularly useful but it is a good step along the way.
I'm going to write another post soon with more useful Python programming (reading, doing something, saving data) and start to move my tools and libraries into ArchaeoPY. For now I'd recommend looking at the information and tutorials on Software Carpentry.
Last year a few others from the University of Bradford and I went to a Software Carpentry workshop at Newcastle University.
The course was very informative, teaching how to code in a scientific environment rather than for coding's sake.
They also now run online open office hours for those who've been on the courses, offering help with any coding problem you might have.
New courses have been announced for this year, including one at Manchester University. They're free to attend and I think they are highly valuable.
If you want to register, follow this link:
Spyder is a great Python development environment, available through Python XY on Windows. Getting it to run on Mac OS X is a little more complicated, however. You can't just download a binary image (.dmg); you need to compile the code from source.
The easiest way I found is to use MacPorts, which comes with a package installer (think setup file).
Prior to installing MacPorts you need to install Apple Xcode and the Xcode command line tools (Xcode Preferences - Downloads).
Run port selfupdate in the terminal.
Install py-spyder. This should pull in most of the required dependencies, including Python 2.7, NumPy and so on, which is why it takes a while.
Then open Spyder from the terminal.
As I said before, most of the required Python libraries should be installed using this method, but not everything included with Python XY. If you find you need a module that is not installed, chances are you'll be able to find it on MacPorts. If unsure, ask and Google.