Georectifying archaeological field drawings using only 2 ground control points



The title says it all: how to georectify archaeological field drawings in a GIS when you only have two ground control points (GCPs).

This is a problem that I can imagine surfaces in other projects besides those I’m involved with. In general, when making field drawings on an archaeological excavation, two known points, or GCPs, are enough to position the drawing exactly within the local coordinate system. The extent covered by a single drawing is usually a few square meters, and even in the very unlikely case that the local coordinate system were in some projection other than a purely rectangular one, the errors caused by importing the plan into a GIS using only two points are minimal compared to the errors made while drawing. Using more than two points is only necessitated by large drawings, which need additional fixed references to cover the whole drawing area.

On the other hand, a GIS, which otherwise is very good at organizing the measurement data and drawings of an excavation, often has a georeferencing system that requires a minimum of three GCPs. This is (probably) because the most commonly used affine transformations in principle call for three GCPs, and in many cases the GIS includes options to provide many more than three. The reason for this is twofold: firstly, using more than the necessary number of GCPs allows statistical estimation of errors; secondly, more points allow transformations other than linear ones, ranging from second-degree polynomials to “rubber sheeting”, which is very useful for historical and otherwise distorted maps that can be georeferenced using an array of points distributed evenly over the whole map area.
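To make the three-GCP requirement concrete: a general affine transform has six coefficients, so three point pairs determine it exactly, and every additional point then contributes residuals that can be used for error estimation. A minimal Python sketch of the exact three-point case (the function names are my own):

```python
def affine_from_three_points(src, dst):
    """Exact affine transform through three point pairs:
    (x, y) -> (a*x + b*y + c, d*x + e*y + f)."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)

    def solve(v1, v2, v3):
        # Cramer's rule on [[x1, y1, 1], [x2, y2, 1], [x3, y3, 1]] * row = v
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    xs = [p[0] for p in dst]
    ys = [p[1] for p in dst]
    return solve(*xs), solve(*ys)

def apply_affine(coeffs, point):
    (a, b, c), (d, e, f) = coeffs
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)

# Example: scale x by 2, y by 3, translate by (10, 20).
src = [(0, 0), (1, 0), (0, 1)]
dst = [(10, 20), (12, 20), (10, 23)]
t = affine_from_three_points(src, dst)
print(apply_affine(t, (2, 2)))  # (14.0, 26.0)
```

With more than three points one would instead solve the same equations in a least-squares sense, which is where the error statistics come from.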

However, the problem discussed in this post is at the other end of the scale: exact drawings covering a small area, often in scales between 1:10 and 1:50 – although 1:1 drawings of especially interesting details are not unknown.

Let us assume that we have such a drawing, with two known control points whose coordinates are given in the local coordinate system. The problem is how to import this drawing into a GIS. A scanned drawing can be imported, and a georectifying interface, like g.gui.gcp in GRASS, can be used to mark the GCPs. In GRASS, this is done on the imported drawing so that the GCPs are marked on the original drawing, and the corresponding coordinates in the local coordinate system are entered in the list of GCPs visible in the window. However, to perform the rectification, one needs to enter at least three coordinate pairs. The third point can be simulated, for example by drawing additional elements both at the original drawing location and at the target location, i.e. the location where the imported drawing is to be placed. However, this process is rather complicated and slow. (If the drawing capabilities were at the level of AutoCAD, it would be much speedier, as one could add elements based on the geometry of the existing elements, i.e. the GCPs.)

One simple solution is to use an Octave/MATLAB script like the one below to calculate a third common point between the source and the target locations:

movingpts = [ 150.470182741 166.401972978; 143.499307479 1233.29837752 ]
targetpts = [ 5148.02 5008.54; 5149.76 5009.00 ]
newpoint = [1500 1500]
t = cp2tform (movingpts,targetpts,'nonreflective similarity')
newpoint2 = tformfwd(t,newpoint)

For this script, you need the Octave image package, which contains the cp2tform and tformfwd functions.

The first line puts the coordinates of the GCPs on the original drawing into the variable movingpts. The second line puts the GCP coordinates in the local, target coordinate system into targetpts. The third line defines a third point in the drawing, newpoint – this coordinate can be anything, as it does not need to refer to any existing point on the drawing itself. The fourth line creates the transformation matrix; important here is the third parameter 'nonreflective similarity', which defines the transformation type. ‘Nonreflective similarity’ means in practice that the only allowed operations are rotation, scaling and translation, which is exactly what we need for archaeological field drawings.

On the fifth line, the coordinate for the third point is transformed to the target coordinates using the defined transformation matrix. The result in this case is 5150.766081868117 5006.89610151436.

And now we have the necessary coordinates for defining the third GCP: 1500 1500 in the drawing, 5150.77 5006.90 in the target system.
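If Octave is not at hand, the same third point can be computed in plain Python with no libraries at all: a nonreflective similarity is just a complex-linear map z → a·z + b with complex a and b, and two point pairs determine it exactly. A sketch of the same calculation (the function names are my own):

```python
# Nonreflective similarity (rotation + uniform scale + translation)
# written as target = a * source + b over the complex numbers.

def similarity_from_two_points(src, dst):
    """src, dst: two (x, y) pairs each; returns (a, b) so that dst = a*src + b."""
    s1, s2 = (complex(*p) for p in src)
    d1, d2 = (complex(*p) for p in dst)
    a = (d2 - d1) / (s2 - s1)  # rotation and scale
    b = d1 - a * s1            # translation
    return a, b

def transform(a, b, point):
    z = a * complex(*point) + b
    return (z.real, z.imag)

moving = [(150.470182741, 166.401972978), (143.499307479, 1233.29837752)]
target = [(5148.02, 5008.54), (5149.76, 5009.00)]

a, b = similarity_from_two_points(moving, target)
x, y = transform(a, b, (1500, 1500))
print(x, y)  # approximately 5150.766 5006.896
```

With exactly two control points the transform is determined exactly, so there is nothing to fit; this reproduces the cp2tform result above.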

A careful person integrates this whole process as a part of an Emacs Org-mode file using Babel for reproducible results. It is always good to document where you got the numbers you were using.


Making org-protocol work again



The move from Gnome 2 to Gnome 3 breaks a previously working setup of org-protocol. Otherwise the instructions on the org-protocol page are still valid, but the Gnome integration part has changed. Previously, you were supposed to run these commands:

gconftool-2 -s /desktop/gnome/url-handlers/org-protocol/command '/usr/local/bin/emacsclient %s' --type String
gconftool-2 -s /desktop/gnome/url-handlers/org-protocol/enabled --type Boolean true

The Gnome system has given up on the protocol-handler mechanism altogether. To get similar results, you’re supposed to use MIME types. The only difference from how it was done previously is that you should add a file called “org-protocol.desktop” to “~/.local/share/applications/”. These contents work:

[Desktop Entry]
Name=org-protocol
Exec=emacsclient %u
Type=Application
Terminal=false
MimeType=x-scheme-handler/org-protocol;

After creating this file, you should run update-desktop-database ~/.local/share/applications/.

This was the only change needed to restore the org-protocol-functionality on my system after an update from Debian Squeeze to Wheezy.

Changing shortcut keys in Gnome3 applications (evince etc.)



A change in Gnome3: the old way of assigning keyboard shortcuts in applications does not work by default, but the required change is small. Yet again, it is a question of enabling the correct option. Open dconf-editor, open the group “org→gnome→desktop→interface”, find the key “can-change-accels” and put a mark beside it. Close the program.

Now you can, for example, open Evince and test this. Open a document, too. Then open the menu “View” and move the mouse over “Best fit” so that it changes color. Do not press the mouse button. Instead, press “Shift+Z”, and you’ll see “Shift+Z” appear beside the text “Best fit”. Move the mouse away from the menu and close the menu by clicking somewhere else. Now you can test your new shortcut by pressing “Shift+Z”, and the document should be zoomed so that the whole page is shown.

This should work in other Gnome programs, too.

DjVu-files, contents free and tables of



In recent months some people close to me have heard me praising the democratization of academic research. This is not what generally appears to be the case, however; in truth, I have to admit that this process is not a general trend but a side product of the proliferation of internet resources, often provided by academic institutions or voluntary organizations. One of these is a site that collects copyright-free digital versions of almost everything.

How can this have any democratizing effect on anything whatsoever? In most fields of research, it probably does not. The research that ends up there is already old, from the beginning of the last century, and there are very few fields where that kind of research has any relevance today. I happily work in an area of academic research, history, where things do not get old that fast, and even in a sub-field of history called Ancient History, more specifically Classical History, where most of the most important reference collections were begun in the 19th century. Now, not all of these are available (sorry, no CIL there yet! But the Oxyrhynchus papyri are.), but there is a lot of interesting stuff around. The democratizing effect comes from the fact that most of this material is very hard to get in print these days. The books were sold out long ago, and until the last ten years or so, the only way to consult them was to make a trip to a library that had acquired them, a hundred years ago or so. This means that to even see the important classic studies that still today form the background of the research, or to use some of the still valid manuals and reference collections, you either had to work in a university with such resources, or you had to find some other subject to study.

How does this relate to DjVu files, you may be asking? Well, the digitized old books are offered in many different formats, of which the modern e-book formats are usually useless. PDF is somewhat better, but PDF just isn’t made for this, nor are the readers. Technically, the best format is DjVu, with different raster layers, hidden text for the OCR and everything. The files are smaller (as if that mattered these days), and the readers much faster.

The selection of readers is small, though. DjView4 seems the most common, and it performs very well, but it has the problem that you cannot add annotations or bookmarks, which is a nuisance, since the digitized books are often hundreds of pages long, and the versions on offer have no such metadata associated with them. A good solution is the program djvusmooth, which can create bookmarks and some other types of metadata, too, and also save these changes to the files. And then you have your contents and bookmarks for DjView4.

Using XSL to convert docx to LaTeX



The sudden realization that the new MS Word format, .docx, is called Office Open XML for a reason made me spend the whole day trying to figure out how these XSL transformations actually work and whether they could be used to convert these new .docx files into something more edi(ta)ble.

It turned out that XSL transformations are in principle a pretty simple thing to do, just like a friend of mine had told me. Here’s an example of how to convert a .docx file to LaTeX, in its crudest form:

First, you need to break open the .docx file. It is basically a simple zip archive, so ‘unzip testdoc.docx’ should do the trick; you’ll end up with several files and subdirectories, of which only the directory called ‘word’ is necessary for this test.
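The unpacking step can also be scripted; here is a small Python sketch (the function name is my own) that pulls the word/ directory out of a .docx archive using only the standard library:

```python
import zipfile

def extract_word_parts(docx_path, dest="."):
    """A .docx file is an ordinary zip archive; extract the word/
    subdirectory, which holds document.xml and footnotes.xml."""
    with zipfile.ZipFile(docx_path) as docx:
        names = [n for n in docx.namelist() if n.startswith("word/")]
        docx.extractall(dest, members=names)
    return names
```

Calling extract_word_parts("testdoc.docx") leaves you with the same word/ directory that unzip would produce.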

Second, here’s the XSL transformation to save in a file:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">

<xsl:output method="text"/>

<xsl:template match="/w:document">
 <xsl:apply-templates select="w:body"/>
</xsl:template>

<xsl:template match="w:body">
 <xsl:apply-templates select="w:p"/>
</xsl:template>

<xsl:template match="w:p">
 <xsl:apply-templates/><xsl:if test="position()!=last()"><xsl:text>&#10;&#10;</xsl:text></xsl:if>
</xsl:template>

<xsl:template match="w:r">
 <xsl:if test="w:footnoteReference">
  <xsl:text>\footnote{</xsl:text>
  <xsl:call-template name="footnote">
   <xsl:with-param name="fid"><xsl:value-of select="w:footnoteReference/@w:id"/></xsl:with-param>
  </xsl:call-template>
  <xsl:text>}</xsl:text>
 </xsl:if>
 <xsl:if test="w:rPr/w:b"><xsl:text>\textbf{</xsl:text></xsl:if>
 <xsl:call-template name="pastb"/>
 <xsl:if test="w:rPr/w:b"><xsl:text>}</xsl:text></xsl:if>
</xsl:template>

<xsl:template name="pastb">
 <xsl:if test="w:rPr/w:i"><xsl:text>\textit{</xsl:text></xsl:if>
 <xsl:call-template name="pasti"/>
 <xsl:if test="w:rPr/w:i"><xsl:text>}</xsl:text></xsl:if>
</xsl:template>

<xsl:template name="pasti">
 <xsl:apply-templates select="w:t"/>
</xsl:template>

<xsl:template name="footnote">
 <xsl:param name="fid"/>
 <xsl:apply-templates select="document('footnotes.xml')/w:footnotes/w:footnote[@w:id=$fid]"/>
</xsl:template>

<xsl:template match="w:footnote">
 <xsl:apply-templates select="w:p"/>
</xsl:template>

</xsl:stylesheet>


You can save that in a file called docxtolatex.xsl in the ‘word’ directory. Then, in that directory, run ‘xsltproc docxtolatex.xsl document.xml’, and you’ll have your screen full of the document, in LaTeX markup.

You’ll notice that this XSLT only converts bold, italics and footnotes. But then again, that’s often all I need to convert…

How to construct a collection of articles with LaTeX


I had a need to edit a collection of articles and turn it into a book. This is not a use case covered by any of the standard LaTeX classes, so I looked for other options. The class combine seemed to provide what I needed, but in the end it turned out to be too limiting. While seemingly offering everything one could ask for, it actually restricted what one could do with the document to such an extent that it was impractical to continue using it. I finally accepted this when I tried to create two indexes for the whole book. It just did not work.

Much the same effect is achieved with the code snippet below. It uses ideas from combine.cls, but in a much simplified manner. It is very simple to use: it creates a new environment papers that can be used to enter the individual papers. I have assumed that each paper is included as an individual document; this means that each paper can have its own \documentclass and \usepackage commands, and each paper can be edited individually. To include the papers in the main document, use it like this:
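In outline, the usage is to \input each paper inside the papers environment (a sketch; the file names are placeholders):

```latex
\begin{papers}
\input{paper1.tex}
\input{paper2.tex}
\end{papers}
```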


Where each of the papers is of the form:

\documentclass{article}
\title{A paper of the best practices}
\author{Johnny B. Good}
\begin{document}
\maketitle
In this paper \ldots
\end{document}

Note that the preamble of the included documents should be kept as simple as possible, as for example redefinitions of commands are not ignored. If the included documents need any special packages, they have to be loaded in the preamble of the main document.

The code below redefines \documentclass, \usepackage and the document environment so that they are ignored, and \maketitle so that the titles of the papers are printed and added to the table of contents and to the marks for the headers. How those marks are used is a matter of your page style.
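The idea of those redefinitions can be sketched like this (my own reconstruction of the approach, not the original code; the original also handled subtitles and other details):

```latex
\makeatletter
\newenvironment{papers}{%
  % Neutralize commands that only make sense in a stand-alone document:
  \renewcommand{\documentclass}[2][]{}%
  \renewcommand{\usepackage}[2][]{}%
  \renewenvironment{document}{}{}%
  % Print each paper's title and record it in the TOC and the header marks:
  \renewcommand{\maketitle}{%
    \section*{\@title}%
    \addcontentsline{toc}{section}{\@title}%
    \markboth{\@author}{\@title}}%
}{}
\makeatother
```

Because the redefinitions happen inside the environment, they stay local to it and normal behavior returns after \end{papers}.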

% This code includes small pieces from combine.cls and some other sources.
% No guarantees of any kind given. Use at your own discretion.
% Feel free to modify and distribute in any way you see fit.




    \ifthenelse{\equal{\@@subtitle}{}}{%empty subtitle
}{%not empty subtitle



Happy LaTeXing!

LaTeX and math symbols in text fonts


While doing the layout of a small magazine in Finnish, I’ve for a long time already been using Unicode with LaTeX. It is just so much easier to write everything in Unicode under Emacs, and then let LaTeX/TeX take care of the rest. Mostly this works quite well, but occasionally there are surprising quirks. Like yesterday, when I was doing the layout for the next issue, to appear in a few weeks. There was an article about archaeological dating methods, and a few paragraphs dealt with radiocarbon dating, which is characterized by the required calibration process and by the uncertainty in the results, usually expressed with something like 225 AD ± 75 years. Now, of course LaTeX has a symbol for this; you can easily get it using \pm in math mode.


Now, the problem with this is that in the middle of text set in Garamond, with old-style figures, there suddenly appears a mathematical symbol in TeX’s own math font, which is naturally quite beautiful, but does not really look that nice within the surrounding Garamond environment. Especially since the Garamond I’m using does include a glyph of its own for the plus-minus symbol. Now, since I’m working with Unicode (or, to be more specific, text encoded as UTF-8), the natural choice was to find out how to enter the corresponding symbol in Emacs (‘_’ + ‘+’ in the input method latin-9-prefix).

Now, to make LaTeX handle UTF-8, I’ve been using the package inputenc like this:

\usepackage[utf8x]{inputenc}

I have read somewhere, the one actually should use the option “utf8” instead of “utf8x” as it is better supported or something, but in practice, “utf8” never works well, and alway, even for simple texts, calls for me to enter some kind of declarations for special characters and so on, so I’ve been sticking to “utf8x” this far. Now, one would expect, that using this setup, when LaTeX encounter the UTF-8 encoded ± in the text, it would find the corresponding glyph in the font and use that. But no, that is not what happens; instead, even though using ± in the middle of the text, LaTeX still finds the glyph in the TeX math font. Why?

Well, that’s because of the declaration for this character in the file uni-0.def (part of the ucs package behind the utf8x option), which maps the code point to \ensuremath{\pm}.

The \ensuremath forces math mode on, and thus ensures that this symbol is always taken from the math fonts, no matter whether the text font has it or not!

Now, a remedy: use the package textcomp, which provides the command \textpm. That picks the right glyph from the right font! This is quite stupid, though, because the whole point of using Unicode is not having to use these LaTeX commands to arrive at special characters.
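With the plain “utf8” option, inputenc does document a way to override a character’s mapping yourself, so ± could be pointed at the textcomp glyph; a sketch I have not tested myself:

```latex
\usepackage[utf8]{inputenc}
\usepackage{textcomp}
\DeclareUnicodeCharacter{00B1}{\textpm}
```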

And it remains to be seen, whether the option “utf8” to inputenc would give better results in this case. Perhaps I’ll test that at some point.

Emacs, flyspell-mode and “centralized”

It was a rather irritating fight today. It’s been a while since I last tried Emacs’ flyspell mode, which is supposed to check your writing on the fly, as the name implies. It works quite well, yes, but I soon got irritated by all the suggested corrections for “-ize” endings: supposedly “centralized” should be replaced by “centralised”, which looks sort of unnatural to me. I had to check the Oxford English Dictionary, which does not even recognize the form “centralise”, and this made me pretty irritated!

Google came to help, or so it seemed. I ended up looking at wordlist packages for Debian etc., but let’s cut the story short. The default spell-checking program used by flyspell-mode is “ispell”, which is a venerable program, functional but outdated by, for example, aspell, which turned out to have tens of dictionaries available, among which I found several with -ize suffixes.

Now, the only question that remains is: why does Emacs’ ispell mode default to ispell instead of aspell?

A Very Small Example of Applicative Functors in Haskell


This is to document a small, working example of how applicative functors can be used in Haskell.

import Control.Applicative

f1 :: Int -> Int -> Int
f1 x y = 2*x+y

main = print (f1 <$> Just 1 <*> Just 2)

A very short explanation follows.

On line 1, the necessary base library module is imported.

On lines 3 and 4, a small function from two integers to one integer is defined.

On line 6, it is shown how applicative functors are used to apply the function to Maybe values.

Archives of European Sociology


This time I’ll be writing something more connected to history than to computers, but it has pretty much to do with modern technology anyway.

If you do a Google search for the journal title “Archives of European Sociology”, you’ll get a long list of citations to a journal by that name, like “Brubaker, Rogers, Ethnicity without Groups, Archives of European Sociology, 18, 2, 2002”. Now, my wife, who happens to be a historian, tried to find this journal, as she wanted to see an article published there. Based on the huge number of citations to the journal found by Google, she of course assumed that the journal was well known, widely distributed, and, logically, available at the local university library. To her big surprise, she was not able to find the journal in any library database; the closest match was the Archives européennes de sociologie, an international publication that also had an English title (European Journal of Sociology) and a German title (Europäisches Archiv für Soziologie). That obviously could not be it, as the journal’s home page at the publisher’s site very clearly stated the title in all three languages.

But in the end, a comparison of the references to the mysterious journal with the table-of-contents data at the publisher’s site showed that this was, after all, the mysterious Archives of European Sociology. Why on earth was it always referred to under this name, when the publisher, and the journal itself, very clearly used the English title European Journal of Sociology? The thing remained a mystery until today, when she found one potential explanation for the misnomer:

More and more, scientific journals have been adopting the convention that the headers and footers of the pages include, in addition to the page number, the author’s name and a part of the title, a reference to the journal itself: the name of the journal, the year, the volume and number of the current issue, and the pages covered by the article in question; most often this information appears in the footer of the first page of each article. This is a wonderful habit, as it saves the nerves of so many academics fervently copying articles and trying to sort the piles of copies later. The Archives européennes de sociologie had adopted this policy already in the early 1980s. And you know what? They did not want to print the whole name of the journal in the footer, as it was rather long and would have forced the footer to extend to a second line; instead, they used an abbreviation, also a venerable habit. The abbreviation was Arch. europ. sociol.

One can just imagine how an academic sorting his piles of photocopies finds this interesting article he remembered having somewhere, is convinced of its value, and decides to cite it in his or her next work. But where did the article come from? Luckily the necessary information is included in the photocopy itself: “Arch. europ. sociol.” Now, if your mother tongue is English, and you have this article written in English from a journal whose name is abbreviated thus, what is the logical English name that can be constructed from that? “Archives of European Sociology”.