On most excavations the large number of stratigraphic units and contexts makes it necessary to use some sort of representation of the relative chronological sequence to keep track of what has already been excavated (not to mention building archaeology). This tool is the Harris Matrix.
It can be defined as a directed graph running from the most recent deposits down to the oldest, where the nodes represent layers connected through stratigraphic relations (edges). This year in Gortyna I tried using the Graphviz software to automate the creation of the Harris Matrix for the excavation area I was in.
UPDATE: I’ve published a first draft of a simple application to automate the generation of the Harris Matrix. Read more here.
There are two steps involved here:
- keeping the stratigraphic information stored in some way
- processing information to obtain the graph
To install Graphviz, follow the instructions for the operating system you are using: it is straightforward in most cases. The most important thing you ought to know before diving into this brief tutorial is that Graphviz is not a GUI application, i.e. you don’t draw your graph, but rather you describe it in a text file. The file is then processed using one of the many programs that Graphviz provides. If you can’t live without buttons and menus, try Dia, which can also export to the Graphviz format. Dia is a GTK+ based diagram creation program for Linux, Unix and Windows released under the GPL license.
Graphviz has its own native, plain text format, which is documented on the website. Graphviz .dot files can be read and written with any text editor like kate, gedit, jedit or notepad++. Keeping a file of this kind is the obvious choice for an experiment, but of course the single-file approach also has a lot of problems.
This is a sample from the final .dot file I had compiled during the excavation weeks:
Apart from the initial preamble, it’s a ridiculously easy syntax. The Harris Matrix is to be read top-down, so for example A -> B means “A comes after B”. You can also concatenate multiple relations on the same row, as in A -> B -> C. Indenting is not mandatory, but it helps keep your file clean. You can write comments on any line that starts with a # character; C++-style // and /* */ comments work as well.
It’s not that difficult to keep this file updated by hand, really. One thing you might worry about is redundant relations, which can easily make your graph ugly and unreadable. But this is about automation, so this isn’t going to be a problem: we’ll be recording each relation, even the redundant ones.
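As a rough illustration of the syntax described above, the relations could also be kept as plain pairs and turned into DOT text with a few lines of Python. This is only a sketch: the function name to_dot, the graph name and the unit numbers are made up for the example and do not come from the Gortyna file.

```python
def to_dot(relations, name="harris"):
    """Return DOT source for a list of (later, earlier) unit pairs."""
    lines = ["digraph %s {" % name]
    for later, earlier in relations:
        # "later -> earlier" reads as "later comes after earlier"
        lines.append('    "%s" -> "%s";' % (later, earlier))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical unit numbers; (1, 3) is a redundant relation kept on purpose.
relations = [(1, 2), (2, 3), (1, 3)]
print(to_dot(relations))
```

The printed text can be saved to a .dot file and processed like a hand-written one.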
We said at the beginning that the Harris Matrix is a directed graph. Graphviz comes with a lot of tools, but only one does what we need, and it’s named dot. From the command line we can just run
dot harris-matrix.dot -Tpng -o harris-matrix.png
and get our data compiled as a graph almost instantly. The -T command line option specifies which one of the many available output formats we want to get. The -o flag (that is, option) precedes the output file name.
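The same call can be scripted. What follows is a minimal Python sketch, assuming dot is installed and on the PATH; dot_command and render are hypothetical helper names, and the only options used are the -T and -o ones from the command line above.

```python
import shutil
import subprocess
from pathlib import Path


def dot_command(source, fmt="png"):
    """Build the argument list for the dot call shown above."""
    output = Path(source).with_suffix("." + fmt)
    return ["dot", str(source), "-T" + fmt, "-o", str(output)]


def render(source, fmt="png"):
    """Run dot if it is installed; return the output path, or None otherwise."""
    command = dot_command(source, fmt)
    if shutil.which(command[0]) is None:
        return None  # Graphviz is not on the PATH
    subprocess.run(command, check=True)
    return command[-1]
```

Calling render("harris-matrix.dot") would write harris-matrix.png next to the input file.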
So far, the result is quite good. But redundant relations are still there, and I promised it wouldn’t be a problem at all.
Here’s where the power of UNIX comes to help. tred is another of the many tools provided by Graphviz, and it acts as a “transitive reduction filter for directed graphs”. So, it has to run before dot reads the input file. A pipe (represented by the | character) is the easiest way to pass data from one program to another in UNIX style. Here’s how I did it:
tred harris-matrix.dot | dot -Tpng -o harris-matrix-tred.png
dot by default accepts input from stdin, while tred by default uses stdout as output. Many simple programs that each do one single operation, and do it well: this is the core of the UNIX philosophy, and Graphviz follows it. Once you understand this concept, things will be much easier. The output of this second command is slightly different from the first one.
You can play around with some general options to change the graphic layout of your graph. These are two options I often use to get better looking Harris Matrices:
That’s enough for now. In the next tutorial, we’ll go further, using Graphviz as a programming library through Python. This means that we will no longer need to enter the relations manually, we will have a GUI, and our data will be stored in a database.