Author Archives: Popefinn

ArchaeoPY: Constructing and Utilising Open Source Software for Archaeological Geophysics

This presentation was given at the "Recent Work in Archaeological Geophysics" conference in December 2014.

Geophysical Challenge 1 - Need some help?

It has been a few weeks since we posted our first geophysical challenge, so we have written a tutorial post to help you along. First things first: download the data for this challenge and save it to the same folder as your Python challenge 1 script. Now open Spyder. The step-by-step tutorial explains each step as it goes.


Geophysical Challenge 1

Following on from yesterday's Geolunch, here's the first Applied Geophysical ArchaeoPy challenge:

Load XYZ .CSV file and plot the data within python

You can do this in any way you want but some hints and tips are given below:

  1. Download the data from http://www.archaeopy.org/wp-content/uploads/2014/04/xyz.csv
  2. numpy.loadtxt to load in a CSV file
  3. matplotlib.mlab.griddata to grid the XYZ data
  4. matplotlib.pyplot to plot the data
  5. If you get stuck, Google is your friend!
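
The hints above can be sketched roughly as follows. This is only one possible approach, and it rests on assumptions: that the file holds three comma-separated columns (X, Y, Z) with no header row, and that the readings sit on a regular grid. Because matplotlib.mlab.griddata was removed in Matplotlib 3.1, the sketch grids the data with NumPy alone. The function names and the tiny sample data are made up for illustration; swap the sample for the path to your downloaded xyz.csv.

```python
import io
import numpy as np

def load_xyz(source):
    """Load comma-separated X, Y, Z columns into three 1D arrays."""
    x, y, z = np.loadtxt(source, delimiter=',', unpack=True)
    return x, y, z

def grid_xyz(x, y, z):
    """Place readings taken on a regular grid into a 2D array (rows = Y, columns = X)."""
    xs, col = np.unique(x, return_inverse=True)
    ys, row = np.unique(y, return_inverse=True)
    grid = np.full((ys.size, xs.size), np.nan)  # NaN marks positions with no reading
    grid[row, col] = z
    return xs, ys, grid

def plot_grid(xs, ys, grid):
    """Show the gridded data as a greyscale image (needs Matplotlib)."""
    import matplotlib.pyplot as plt
    plt.imshow(grid, origin='lower', cmap='Greys',
               extent=(xs[0], xs[-1], ys[0], ys[-1]))
    plt.colorbar(label='Z')
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.show()

# Made-up sample standing in for xyz.csv (an assumption, not the real data)
sample = io.StringIO("0,0,1.0\n1,0,2.0\n0,1,3.0\n1,1,4.0\n")
x, y, z = load_xyz(sample)
xs, ys, grid = grid_xyz(x, y, z)
print(grid)
```

Calling plot_grid(xs, ys, grid) afterwards displays the result. If your data is not on a regular grid, you will need an interpolating gridder instead.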
Extra Credit
It's not very often that we use completely raw geophysical data. Usually we have to apply some processing steps to make the data more usable.
A simple initial processing step is a Zero Mean Traverse: subtract each traverse's mean from every reading along that traverse. We don't need any extra software packages to do this.
numpy.mean can be used to calculate the mean of a group of values or, more usefully, to return the means of the rows/columns in an array.
numpy.subtract can be used to remove those values from the rows/columns.
You might want to use loops and array indexing to do this, but you shouldn't have to.
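
As a minimal sketch of the loop-free approach, assuming the data has already been gridded into a 2D array in which each row is one traverse (if your traverses run down the columns, use axis=0 instead). The function name is made up for illustration.

```python
import numpy as np

def zero_mean_traverse(grid):
    """Subtract each row's mean from that row, assuming one traverse per row."""
    grid = np.asarray(grid, dtype=float)
    # keepdims=True keeps the means as a column so they line up with the rows
    traverse_means = np.mean(grid, axis=1, keepdims=True)
    return np.subtract(grid, traverse_means)

# Made-up readings: two traverses of three readings each
example = zero_mean_traverse([[1, 2, 3], [10, 20, 30]])
print(example)
```

Each row of the result now averages zero. Note that numpy.mean returns NaN for any traverse containing a NaN, so gaps in the data need handling first.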
Upload your example to the ArchaeoPY Bitbucket (details here).
There is a prize for the most readable and well-documented code.

Reading, Doing Something, Saving

Following on from my guide to Python for Archaeologists, this is an introduction to dealing with data in Python.

@GirlWithTrowel posted some documents about her geophysics data processing steps here, including some possible programming steps to make life easier.

Initially we'll do much of this through command line programs rather than creating a pretty user interface.

Reading and Printing

Last time we printed 'Hello World'. This time we'll do something a little more useful.

The code and data for this are available here.

"""
Program to open a file and print the contents line by line
"""
# Sets the path and filename that you want to open.
filename = 'data/MultipleLineText.txt'

# Opens the file defined by filename; 'r' means the file is opened to read.
# 'with' closes the file automatically once the indented block has finished.
with open(filename, 'r') as f:
    # *Loop* For each line in the file 'f', does whatever is inside the loop
    for line in f:
        # The indentation means we're inside the loop
        # Prints the line, stripping the newline already on the end of it
        print(line.rstrip('\n'))

I'm hoping the comments are detailed enough to explain what this code does.

Reading and Doing Something

Suppose we have some X, Y, C1, C2 data in a comma delimited text file with a Header line and we want to calculate the mean, min, max and standard deviation.

This gives us the chance to try out modules and functions. The code and data are available here.

"""
Program to open a CSV file, calculate some statistics, deduct the mean and save the output
"""
# Imports the NumPy library, giving access to lots of fast numerical functions
import numpy as np

# Defines a function that you can call from within your program
def stats(data):
    # Uses NumPy to calculate statistics of data
    mn = np.min(data)
    mx = np.max(data)
    mean = np.mean(data)
    sd = np.std(data)
    return mn, mx, mean, sd

# Sets the path and filename that you want to open.
filename = 'data/randcsv.txt'

# Determines the header information by reading the first line of the file
# strip() removes the newline from the end of the line
with open(filename, 'r') as f:
    header = f.readline().strip()

# Uses NumPy to load the rest of the file into an array, skipping the header line
data = np.loadtxt(filename, skiprows=1, delimiter=',')

# Extracts the 3rd column (C1) from the array; Python counts from 0, so column 1 is index 0
c1 = data[:, 2]
# Extracts the 4th column (C2) from the array
c2 = data[:, 3]

# Passes data to our stats function and receives min, max, mean and standard deviation
c1_stats = stats(c1)
c2_stats = stats(c2)

print('statistics for C1 are', c1_stats)
print('statistics for C2 are', c2_stats)

# Deducts the mean from the columns of data
c1 = c1 - c1_stats[2]
c2 = c2 - c2_stats[2]

# Returns the zero-mean values into the array
data[:, 2] = c1
data[:, 3] = c2

# Uses NumPy to save the updated array as comma-delimited text with the same header
# comments='' stops savetxt putting '# ' in front of the header line
np.savetxt('data/randvscout.txt', data, delimiter=',', header=header, comments='', fmt='%.2f')

As above, I'm hoping these examples are suitably commented so you can understand what's going on. I encourage you to modify this code and make it do more exciting things more quickly.
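
One possible modification, offered as a sketch rather than the only way to do it: NumPy's axis argument calculates a statistic down every column in one call, so the columns do not have to be pulled out one at a time. The function name and the sample readings here are made up for illustration.

```python
import numpy as np

def column_stats(data):
    """Return min, max, mean and standard deviation for every column at once."""
    return (np.min(data, axis=0), np.max(data, axis=0),
            np.mean(data, axis=0), np.std(data, axis=0))

# Made-up readings standing in for the C1 and C2 columns
readings = np.array([[1.0, 10.0],
                     [2.0, 20.0],
                     [3.0, 30.0]])
mn, mx, mean, sd = column_stats(readings)

# Broadcasting subtracts each column's own mean from that column
zero_mean = readings - mean
print(zero_mean)
```

The same call works unchanged however many data columns the file has.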

Now that I've gone through enough examples to introduce the basics, I'm going to start trying to make some modules and code that people can use to properly process geophysical datasets.

Getting started with Python for Archaeology

The 10th International Conference on Archaeological Prospection in Vienna involved a small but significant discussion about open software. This has continued on Twitter, and I've committed myself to helping people start writing open source software in Python.

I've chosen the Python programming language, partly because it's one I have a working knowledge of but, more importantly, because it's inherently readable. We have to remember that we're primarily scientists, archaeologists and geophysicists, not programmers, and by using Python we can concentrate on what we want to happen to our data rather than on how to tell the computer to do it.

Installing Python

To get started with Python you'll need an implementation of Python on your computer. Python is cross-platform (it will work on Windows, Linux, Mac ...) but the method of installing it can be very different on each platform.

I've traditionally used PythonXY and Spyder on Windows and Mac systems. For the purposes of this introduction I'm going to recommend Anaconda because it's entirely cross-platform, so we should all encounter the same issues at the same time.

The installation method differs slightly between operating systems, but I think the Anaconda support documents are detailed enough.

If you don't want to use Anaconda, PythonXY or installing Spyder and Python from scratch are suitable alternatives.

It's only during the writing of this post that I've encountered Anaconda. It was very easy to install on my Mac and, unlike PythonXY, includes 64-bit support. I'm going to try using both Anaconda and PythonXY to see if they both work seamlessly.

Keeping Track of Code Changes

When I started writing code to process my geophysical data, one of the hardest and most frustrating things was that I changed it so much from day to day that it was impossible to know how I'd created a particular dataset. I should have used a revision control system from the start but didn't, so I'm going to recommend that you do.

I've chosen to use EasyMercurial because it's the one favoured by Software Carpentry and is completely cross-platform. Some instructions are available here.

Getting started with Spyder

To start writing some code open up Spyder:

[Image: Opening Spyder on Mac OSX]

[Image: Opening Spyder on Windows]

I like to think Spyder is relatively simple, but to the non-programmer it probably makes about as much sense as digging does to me. Hence the image below...

[Image: Using Spyder]

This looks the same on whatever operating system you're running, and you'll probably find I flick between Mac, Windows and IPython.

Writing Some Code

Well we've done the boring stuff and got everything we need installed. Now to write some code.

Create a New file in Spyder and you should see something similar to this:

# -*- coding: utf-8 -*-
"""
Created on Sun Jun  9 15:53:31 2013

@author: popefinn
"""

The """ and # denote that the text is a comment and should not be 'run' as code. On a new line underneath the metadata comment section, type:

print("Hello World")

Run the Code by pressing F5, save the file as something sensible within your EasyMercurial repository, and accept the default runtime options. You should see the following:

[Image: HelloWorld]

You've just written your first Python program. It's not particularly useful but it is a good step along the way.

I'm going to write another post soon with more useful Python programming (reading, doing something, saving data) and start to move my tools and libraries into ArchaeoPY. For now I'd recommend looking at the information and tutorials on Software Carpentry.

Software Carpentry

Last year a few others from the University of Bradford and I went to a Software Carpentry workshop at Newcastle University.

The course was very informative, teaching how to code in a scientific environment rather than for coding's sake.

They also now run online open office hours for those who've been on the courses, offering help with any coding problem you might have.

New courses have been announced for this year, including one at Manchester University. They're free to attend and, I think, highly valuable.

If you want to register follow this link:

http://software-carpentry.org/blog/2013/02/a-bunch-of-bootcamps.html

Spyder on Mac OSX

Spyder is a great Python development environment, available through PythonXY on Windows. Getting it to run on Mac OS X is a little more complicated, however. You can't just download a binary image (.dmg); you need to compile the code from source.

The easiest way I found is to use MacPorts, which comes with a package installer (think setup file).

Prior to installing MacPorts you need to install Apple Xcode and the Xcode command line tools (Xcode Preferences - Downloads).

Once MacPorts is installed you need to open a Terminal.

[Image: Mac Terminal]

[Image: Mac Terminal 2]

Run port selfupdate:

sudo port selfupdate

Install py-spyder. This should include most required dependencies, including Python 2.7, NumPy etc., and because of this it takes a while:

sudo port install py-spyder

Then open Spyder from the terminal.

[Image: Open Spyder]

[Image: Spyder Running]

As I said before, most required Python libraries should be installed using this method, but not everything included with PythonXY. If you find you need a module that is not installed, chances are you'll be able to find it on MacPorts. If unsure, ask or Google it.