Harris Matrix Data Package

WARNING: This is a draft specification and still under development. If you have comments or suggestions, please file them in the issue tracker.

Harris Matrix Data Package is a lightweight and user-oriented format for publishing and consuming archaeological stratigraphy data. Harris Matrix data packages are made of simple and universal components. They can be produced from ordinary spreadsheet or database software and used in any environment.

Author Stefano Costa
Created 2018-12-04
Updated 2022-10-30
JSON Schema (not yet ready)
Version 0.2

Language

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119.

Changelog

  • 0.2: more precise correspondence with the Data Package specification
  • 0.1: first formal specification

Introduction

This document contains the “Harris Matrix Data Package” specification, a lightweight and platform-agnostic format for publishing, archiving and consuming archaeological stratigraphy data.

Harris Matrix Data Package doesn’t reinvent the wheel. It brings together two separate and well-defined specifications:

  1. the CSV table schema developed by Thomas S. Dye for the hm Lisp package
  2. the JSON metadata descriptor of Tabular Data Package from Frictionless Standards

Glossary

The following definitions apply in the context of the Harris Matrix Data Package specification:

  • data descriptor is a JSON file, named datapackage.json, that is found in the top-level directory of a data package and contains metadata about the entire data package (name, description, creation date, author names, references) together with the data package schema
  • each resource is a CSV table
  • contexts refer to archaeological stratigraphy units as produced by the single context recording method; a context can be either positive or negative, and is described in terms of unit-type and position
  • observations refer to the stratigraphic relationship between pairs of contexts, and can only record relative chronology of earlier-later relationships
  • inferences refer to once-equal contexts that are recorded separately but treated as a whole for the purpose of stratigraphy, as in the case of a floor level that a later trench divided into two separate units
  • phases and periods are groupings of contexts that are based on chronological affinity
  • events are associations between absolute chronology events and contexts; the resource specifies the nature of each association using terms introduced to archaeology by Jeffrey S. Dean in the essay “Independent dating in archaeological analysis”, published in Advances in Archaeological Method and Theory in 1978
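
As an illustration of the observations definition above, the following minimal Python sketch treats each observation as an (earlier, later) pair of context identifiers and derives one possible relative ordering. The function name stratigraphic_order, the pair ordering and the context identifiers are assumptions made for this example; they are not part of the specification and do not describe the column layout of the CSV resources.

```python
from graphlib import CycleError, TopologicalSorter


def stratigraphic_order(contexts, observations):
    """Return context identifiers ordered from earliest to latest.

    contexts: iterable of context identifiers (hypothetical values).
    observations: iterable of (earlier, later) pairs; this ordering of
    the pair is an assumption of the sketch, not the CSV layout.
    Raises ValueError when the observations contradict each other.
    """
    # Start with every context present, so isolated contexts are kept.
    sorter = TopologicalSorter({context: set() for context in contexts})
    for earlier, later in observations:
        # The later context must come after the earlier one.
        sorter.add(later, earlier)
    try:
        return list(sorter.static_order())
    except CycleError as exc:
        # exc.args[1] holds the contexts that form the cycle.
        raise ValueError(f"contradictory observations: {exc.args[1]}") from exc


# Hypothetical contexts: 3 is the earliest, 1 the latest.
order = stratigraphic_order(["1", "2", "3"], [("3", "2"), ("2", "1")])
print(order)
```

Because observations record only relative chronology, contradictory pairs (A earlier than B and B earlier than A) surface as a cycle, which the sketch reports as an error rather than trying to resolve.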

Specification

Harris Matrix Data Package builds directly on the Tabular Data Package specification. Thus a Harris Matrix Data Package MUST be a Tabular Data Package and conform to the Tabular Data Package specification.

Harris Matrix Data Package has the following requirements over and above those imposed by Tabular Data Package:

  • a Harris Matrix Data Package MUST be a valid Tabular Data Package
  • a Harris Matrix Data Package MUST contain the name, title, profile, contributors and created properties
  • the value of the profile property MUST be https://www.iosa.it/software/harris-matrix/harris-matrix-data-package.json
  • a Harris Matrix Data Package SHOULD contain the description, licenses, id, version and keywords properties
  • there MUST be at least two resource items in the resources array, named contexts and observations
  • there MAY be a resource named inferences
  • there MAY be a resource named periods
  • there MAY be a resource named phases
  • there MAY be a resource named events
  • there MAY be a resource named event-order
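
The requirements above can be checked mechanically. The following is a minimal Python sketch, not an official validator: the function name check_descriptor and its return format are assumptions for illustration, and it does not verify conformance to the Tabular Data Package specification itself, which would need a dedicated validator.

```python
PROFILE_URL = (
    "https://www.iosa.it/software/harris-matrix/"
    "harris-matrix-data-package.json"
)
REQUIRED_PROPERTIES = ("name", "title", "profile", "contributors", "created")
RECOMMENDED_PROPERTIES = ("description", "licenses", "id", "version", "keywords")
REQUIRED_RESOURCES = ("contexts", "observations")


def check_descriptor(descriptor):
    """Return (errors, warnings) for a parsed datapackage.json dictionary.

    Errors correspond to MUST requirements, warnings to SHOULD ones.
    """
    errors = []
    warnings = []
    for prop in REQUIRED_PROPERTIES:
        if prop not in descriptor:
            errors.append(f"missing required property: {prop}")
    if "profile" in descriptor and descriptor["profile"] != PROFILE_URL:
        errors.append("profile does not match the Harris Matrix Data Package URL")
    for prop in RECOMMENDED_PROPERTIES:
        if prop not in descriptor:
            warnings.append(f"missing recommended property: {prop}")
    # Collect the names of all resources declared in the descriptor.
    resource_names = {
        resource.get("name")
        for resource in descriptor.get("resources", [])
        if isinstance(resource, dict)
    }
    for name in REQUIRED_RESOURCES:
        if name not in resource_names:
            errors.append(f"missing required resource: {name}")
    return errors, warnings


# Stub descriptor for illustration; real resources also need a path and schema.
example = {
    "name": "harris-matrix-fig12",
    "title": "Principles of Archaeological Stratigraphy, fig. 12",
    "profile": PROFILE_URL,
    "contributors": [{"title": "Thomas S. Dye", "role": "author"}],
    "created": "2018-12-04",
    "resources": [{"name": "contexts"}, {"name": "observations"}],
}
errors, warnings = check_descriptor(example)
print(errors)
print(warnings)
```

In practice the descriptor would be loaded from datapackage.json with json.load before being passed to the function. Optional resources such as inferences or periods need no check here, since their absence is allowed.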

Examples

Resource names are standardized so that the data descriptor can be reused largely unchanged from one package to the next, apart from the package-specific metadata.

{
  "name": "harris-matrix-fig12",
  "title": "Principles of Archaeological Stratigraphy, fig. 12",
  "profile": "https://www.iosa.it/software/harris-matrix/harris-matrix-data-package.json",
  "contributors": [
    {
      "title": "Thomas S. Dye",
      "role": "author"
    },
    {
      "title": "Stefano Costa",
      "role": "contributor"
    }
  ],
  "created": "2018-12-04",
  "sources": [
    {
      "title": "Principles of Archaeological Stratigraphy",
      "path": "https://www.worldcat.org/it/title/613969586"
    }
  ],
  "resources": [
    ...
  ]
}

Implementations

The only known implementation is the Python hmdp tool.

The hm Lisp package will work with a well-formed Harris Matrix Data Package, but it will ignore the JSON data descriptor and the metadata.