# Georectifying archaeological field drawings using only 2 ground control points

The title says it all: how to georectify archaeological field drawings in a GIS when you only have two ground control points (GCPs).

This is a problem that I imagine comes up in other projects than just those I’m involved with. In general, when making field drawings on an archaeological excavation, two known points, or GCPs, are enough to position the drawing exactly within the local coordinate system. The area covered by a single drawing is usually a few square meters, and even in the very unlikely case that the local coordinate system used some projection other than a purely rectangular one, the errors caused by importing the plan into a GIS using only two points are minimal compared to the errors made while drawing. More than two points are needed only when a large drawing requires additional fixed references to cover the whole drawing area.

On the other hand, a GIS, which is otherwise very good at organizing the measurement data and drawings of an excavation, often has a georeferencing system that requires a minimum of three GCPs. This is (probably) because the most commonly used affine transformations in principle call for three GCPs (a general affine transformation has six parameters, and each GCP supplies two equations), and in many cases the GIS lets you provide far more than three. The reason for this is twofold: firstly, using more GCPs than strictly necessary makes it possible to estimate errors statistically; secondly, more points allow transformations other than linear ones, ranging from second-degree polynomials to “rubber sheeting”. These are very useful for historical and otherwise distorted maps, which can be georeferenced using an array of points distributed evenly over the whole map area.

However, the problem discussed in this post is at the other end of the scale: exact drawings covering a small area, often at scales between 1:10 and 1:50, although 1:1 drawings of especially interesting details are not unknown.

Let us assume that we have such a drawing, with two known control points whose coordinates are given in the local coordinate system. The problem is how to import this drawing into a GIS. A scanned drawing can be imported, and a georectifying interface, like g.gui.gcp in GRASS, can be used to mark the GCPs. In GRASS, this is done by marking the GCPs on the imported original drawing and entering the corresponding coordinates in the local coordinate system into the list of GCPs shown in the window. However, to perform the rectification, one needs to enter at least three points. The third point can be simulated, for example by drawing additional elements both in the original drawing location and in the target location, i.e. the location where the imported drawing is to be placed. However, this process is rather complicated and slow. (If the drawing capabilities were at the level of AutoCAD, it would be much speedier, as one could add elements based on the geometry of the existing elements, i.e. the GCPs.)

One simple solution is to use an Octave/MATLAB script like the one below to calculate a third common point between the source and target locations:

movingpts = [ 150.470182741 166.401972978; 143.499307479 1233.29837752 ]
targetpts = [ 5148.02 5008.54; 5149.76 5009.00 ]
newpoint = [1500 1500]
t = cp2tform (movingpts,targetpts,'nonreflective similarity')
newpoint2 = tformfwd(t,newpoint)

For this script, you need the Octave image package, which contains the cp2tform and tformfwd functions. In Octave, load it with pkg load image before running the script.

The first line puts the coordinates of the GCPs on the original drawing into the variable movingpts. The second line puts the GCP coordinates in the local, target coordinate system into targetpts. The third line stores a third point on the drawing in newpoint; this coordinate can be anything, as it does not need to refer to any existing point on the drawing itself. The fourth line creates the transformation matrix; the important part here is the third parameter, 'nonreflective similarity', which defines the transformation type. In practice, 'nonreflective similarity' means that the only allowed operations are rotation, scaling and translation, which is exactly what we need for archaeological field drawings.
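Why two GCPs are enough for this transformation type can be seen by writing it out. The notation below is my own and not taken from the functions' documentation: (x, y) are coordinates on the drawing, (x', y') are target coordinates, s is the uniform scale, θ the rotation angle and (t_x, t_y) the translation.

```latex
\begin{aligned}
x' &= a\,x - b\,y + t_x,\\
y' &= b\,x + a\,y + t_y,
\end{aligned}
\qquad a = s\cos\theta,\quad b = s\sin\theta .
```

There are four unknowns (a, b, t_x, t_y), and every GCP contributes two equations, so two GCPs determine the transformation completely. A general affine transformation has six unknowns, which is where the usual minimum of three GCPs comes from.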

On the fifth line, the coordinate for the third point is transformed to the target coordinates using the defined transformation matrix. The result in this case is 5150.766081868117 5006.89610151436.

And now we have the necessary coordinates for defining the third GCP: 1500 1500; 5150.77 5006.90.
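If Octave is not at hand, the same calculation can be sketched in plain Python. This is not a port of cp2tform, only a minimal two-point fit under the assumption that exactly two GCPs are given; it treats each point as a complex number, so that rotation and scaling become a single complex multiplication. The function names fit_similarity and apply_similarity are made up for this example.

```python
# Sketch: a two-point nonreflective similarity fit using complex arithmetic.
# A point (x, y) is treated as z = x + iy; the transform is w = a*z + b,
# where a encodes rotation and uniform scale and b the translation.

def fit_similarity(src, dst):
    """Return (a, b) such that a*z + b maps the two src points onto dst."""
    z1, z2 = (complex(*p) for p in src)
    w1, w2 = (complex(*p) for p in dst)
    a = (w2 - w1) / (z2 - z1)  # rotation and scale
    b = w1 - a * z1            # translation
    return a, b

def apply_similarity(a, b, point):
    """Transform one (x, y) point and return it as an (x, y) tuple."""
    w = a * complex(*point) + b
    return (w.real, w.imag)

# GCPs on the scanned drawing and in the local coordinate system
moving_pts = [(150.470182741, 166.401972978), (143.499307479, 1233.29837752)]
target_pts = [(5148.02, 5008.54), (5149.76, 5009.00)]

a, b = fit_similarity(moving_pts, target_pts)
third_gcp = apply_similarity(a, b, (1500, 1500))
print(third_gcp)
```

The printed pair should agree with the Octave result quoted above to within rounding. abs(a) gives the scale factor between the scan and the local coordinate system, which is a handy sanity check against the drawing scale and scan resolution.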

A careful person integrates this whole process as a part of an Emacs Org-mode file using Babel for reproducible results. It is always good to document where you got the numbers you were using.

# Making org-protocol work again

The move from Gnome 2 to Gnome 3 breaks a previously working setup of org-protocol. Otherwise the instructions on the org-protocol page are still valid, but the Gnome integration part has changed. Previously, you were supposed to run these commands:

gconftool-2 -s /desktop/gnome/url-handlers/org-protocol/command '/usr/local/bin/emacsclient %s' --type String
gconftool-2 -s /desktop/gnome/url-handlers/org-protocol/enabled --type Boolean true

Gnome has given up on the protocol handler mechanism altogether. To get similar results, you’re supposed to use MIME types. This page shows how it is done; the only difference from how it was done previously is that you should add a file called “org-protocol.desktop” to “~/.local/share/applications/”. These contents work:

[Desktop Entry]
Name=org-protocol
Exec=emacsclient %u
Type=Application
Terminal=false
Categories=System;
MimeType=x-scheme-handler/org-protocol;

After creating this file, you should run update-desktop-database ~/.local/share/applications/.
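If you set up several machines, the file can also be written by a small script. The following is only a sketch: the helper names write_desktop_entry and read_mime_types are invented here, and the demonstration writes into a temporary directory rather than the real one. For actual use you would pass "~/.local/share/applications" and still run update-desktop-database afterwards.

```python
import configparser
import tempfile
from pathlib import Path

# The same contents as the desktop entry shown above.
DESKTOP_ENTRY = """\
[Desktop Entry]
Name=org-protocol
Exec=emacsclient %u
Type=Application
Terminal=false
Categories=System;
MimeType=x-scheme-handler/org-protocol;
"""

def write_desktop_entry(app_dir):
    """Write org-protocol.desktop into app_dir and return its path."""
    app_dir = Path(app_dir).expanduser()
    app_dir.mkdir(parents=True, exist_ok=True)
    path = app_dir / "org-protocol.desktop"
    path.write_text(DESKTOP_ENTRY, encoding="utf-8")
    return path

def read_mime_types(path):
    """Return the MIME types a .desktop file registers, as a list."""
    # interpolation=None so that field codes such as %u are read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as MimeType
    parser.read(path, encoding="utf-8")
    value = parser["Desktop Entry"].get("MimeType", "")
    return [m for m in value.split(";") if m]

# Demonstration in a throwaway directory, so nothing in $HOME is touched.
with tempfile.TemporaryDirectory() as tmp:
    entry = write_desktop_entry(tmp)
    mime_types = read_mime_types(entry)
print(mime_types)
```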

This was the only change needed to restore the org-protocol functionality on my system after an update from Debian Squeeze to Wheezy.

# Changing shortcut keys in Gnome3 applications (evince etc.)

A change in Gnome3: the old way of assigning keyboard shortcuts in applications does not work by default, but the change is small. Yet again, it is a question of enabling the correct option. Open dconf-editor. Open the group “org→gnome→desktop→interface”. Find the key “can-change-accels” and put a mark beside it. Close the program.

Now you can, for example, open Evince and test this. Open a document, too. Then open the menu “View” and move the mouse over “Best fit” so that it changes color. Do not press the mouse button. Instead, press “Shift+Z”, and you’ll see “Shift+Z” appear beside the text “Best fit”. Move the mouse away from the menu and close the menu by clicking somewhere else. Now you can test your new shortcut by pressing “Shift+Z”, and the document should be zoomed so that the whole page is shown.

This should work in other Gnome programs, too.

# DjVu-files, contents free and tables of

In the last months some people close to me have heard me praising the democratization of academic research. This is not what generally appears to be the case, however, and in truth, I have to admit that this process is not a general trend but a by-product of the proliferation of internet resources, often provided by academic institutions or voluntary organizations. One of these is Archive.org, a site that collects copyright-free digital versions of almost everything.

How can this have any democratizing effect on anything whatsoever? In most fields of research, it probably does not. The research that ends up there is already old, from the beginning of the century, and there are very few fields where that kind of research has any relevance today. I’m happily working in an area of academic research, history, where things do not get old that fast, and even in a sub-field of history called Ancient History, more specifically Classical History, where most of the most important reference collections were begun in the 19th century. Now, not all of these appear on archive.org (Sorry, no CIL there yet! But the Oxyrhynchus papyri are there.), but there is a lot of interesting stuff around there. The democratizing process comes from the fact that most of this stuff is very hard to get in print these days. The books sold out long ago, and until the last ten years or so, the only way to consult them was to make a trip to a library that had acquired them a hundred years ago or so. This means that to even see the important classic studies that still form the background of the research, or to use some of the still valid manuals and reference collections, you either had to work in a university with such resources or find some other subject to study.

How does this relate to DjVu files, you may be asking? Well, the digitized old books are offered in many different formats, of which the modern e-book formats are usually useless. PDF is somewhat better, but PDF just isn’t made for this, nor are the readers. Technically, the best format is DjVu, with different raster layers, hidden text from OCR and everything. The files are smaller (as if that mattered these days), and the readers much faster.

The selection of readers is small, though. DjView4 seems to be the most common, and it performs very well as a reader, but it has the problem that you cannot add annotations or bookmarks. This is a nuisance, since the digitized books are often hundreds of pages long, and the versions offered by archive.org have no such metadata associated with them. A good solution is the program djvusmooth, which can create bookmarks and some other types of metadata, too, and also save these changes to the files. And then you can make your contents and bookmarks for DjView4.

# Using XSL to convert docx to LaTeX

The sudden realization that the new MS Word format, .docx, is called Office Open XML for a reason made me spend the whole day trying to figure out how these XSL transformations actually work and whether they could be used to convert these new .docx files to something more edi(ta)ble.

It turned out that XSL transformations were in principle a pretty simple thing to do, just as a friend had told me. Here’s an example of how to convert a .docx file to LaTeX, in its crudest form:

First, you need to break open the .docx file. It basically is a simple zipped archive, so an ‘unzip testdoc.docx’ should do the trick; you’ll end up with several files and sub-directories, of which only the directory called ‘word’ is necessary for this test.
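The same first step can be done from a script without unpacking anything to disk. The sketch below is an aside rather than part of the XSL approach: it reads word/document.xml straight from the archive and collects the text of each paragraph, using the same w: namespace URI that the stylesheet declares. The function name paragraphs_from_docx is invented here, and the sample archive is built in memory only so that the example runs without a real .docx file.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI of the WordprocessingML main document part
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

def paragraphs_from_docx(docx_file):
    """Return the plain text of each w:p paragraph in word/document.xml."""
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    ns = {"w": W_NS}
    return ["".join(t.text or "" for t in p.iterfind(".//w:t", ns))
            for p in root.iterfind(".//w:p", ns)]

# Build a tiny stand-in archive in memory (a real .docx has more parts).
sample_xml = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> world</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", sample_xml)
buffer.seek(0)

paragraphs = paragraphs_from_docx(buffer)
print(paragraphs)  # ['Hello world', 'Second paragraph']
```

With a real file, paragraphs_from_docx("testdoc.docx") works the same way, since zipfile accepts a path as well as a file object.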

Second, here’s the XSL transformation to save in a file:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

<xsl:template match="/w:document">
\documentclass{article}
<xsl:apply-templates/>
</xsl:template>

<xsl:template match="w:body">
\begin{document}
<xsl:apply-templates/>
\end{document}
</xsl:template>

<xsl:template match="w:p">
<xsl:apply-templates/><xsl:if test="position()!=last()"><xsl:text>

</xsl:text></xsl:if>
</xsl:template>

<xsl:template match="w:r">
<xsl:if test="w:footnoteReference"><xsl:text>\footnote{</xsl:text>
<xsl:call-template name="footnote">
<xsl:with-param name="fid"><xsl:value-of select="w:footnoteReference/@w:id"/></xsl:with-param>
</xsl:call-template>
<xsl:text>}</xsl:text>
</xsl:if>
<xsl:if test="w:rPr/w:b"><xsl:text>\textbf{</xsl:text></xsl:if>
<xsl:call-template name="pastb"/>
<xsl:if test="w:rPr/w:b"><xsl:text>}</xsl:text></xsl:if>
</xsl:template>

<xsl:template name="pastb">
<xsl:if test="w:rPr/w:i"><xsl:text>\textit{</xsl:text></xsl:if>
<xsl:call-template name="pasti"/>
<xsl:if test="w:rPr/w:i"><xsl:text>}</xsl:text></xsl:if>
</xsl:template>

<xsl:template name="pasti">
<xsl:apply-templates select="w:t"/>
</xsl:template>

<xsl:template name="footnote">
<xsl:param name="fid"/>

A very short explanation follows.

On line 1, the necessary base library module is imported.

On lines 3 and 4, a small function from two integers to one integer is defined.

On line 6, it is shown how the applicative functors are used to apply the function to Maybe values.

# Archives of European Sociology

This time I’ll be writing something more connected to history than computers, but it has pretty much to do with modern technology, anyway.

If you do a Google search for the journal title “Archives of European Sociology”, you’ll get a long list of citations to a journal by that name, like “Brubaker, Rogers, Ethnicity without Groups, Archives of European Sociology, 18, 2 2002”. Now, my wife, who happens to be a historian, tried to find this journal, as she wanted to see an article published there. Based on the huge number of citations to the journal found by Google, she of course assumed that the journal was well known, widely distributed, and, logically, available at the local university library. To her great surprise, she was not able to find the journal in any library database. The closest match was the Archives européennes de sociologie, an international publication that also had an English title (European Journal of Sociology) and a German title (Europäisches Archiv für Soziologie). That obviously could not be it, as the journal’s home page at the publisher’s site very clearly stated the title in all three languages.

But, in the end, a comparison of the references to the mysterious journal with the table-of-contents data at the publisher’s site did show that this was, after all, the mysterious Archives of European Sociology. Why on earth was it always referred to under this name, when the publisher, and the journal itself, very clearly used the English title European Journal of Sociology? The thing remained a mystery until today, when she found one potential explanation for this misnomer:

More and more, scientific journals have been adopting the convention that the headers and footers of the pages include, in addition to the page number, the author’s name and a part of the title, also a reference to the journal itself: the name of the journal, the year, volume and number of the current issue, and the pages covered by the article in question. Most often this information appears in the footer of the first page of each article. This is a wonderful habit, as it saves the nerves of so many academics fervently copying articles and trying to sort the piles of copies later. The Archives européennes de sociologie had adopted this policy already in the early 1980s. And you know what? They did not want to print the whole name of the journal in the footer, as it was rather long and would have forced the footer to extend to a second line; instead, they used an abbreviation, which is also a venerable habit. The abbreviation was Arch. europ. sociol.

One can just imagine how an academic, sorting through piles of photocopies, finds this interesting article they remembered having somewhere, is convinced of its value, and decides to cite it in their next work. But where did the article come from? Luckily the necessary information is included in the photocopy itself: “Arch. europ. sociol.” Now, if your mother tongue is English, and you have this article written in English from a journal whose name is abbreviated thus, what is the logical English name that can be constructed from it? “Archives of European Sociology”.