PostGIS

From Intamap

Contents

What is PostGIS

PostGIS is an extension to the PostgreSQL relational data base. It is used to store spatial data (so-called simple features: points, lines, polygons, and combinations of them) and provides spatial geometry tests (e.g. Touches, Intersects, Contains) and operations (Buffer, GeomUnion, Distance) and coordinate transformation (Transform). Note that R can also do coordinate transformation, read INSPIRE, and that both use the PROJ.4 Cartographic Projections Library for this.

Having spatial data in a spatial data base (as opposed to lying around in separate files on the hard drive) has several advantages:

  • the data base can do consistency checks
  • data bases are usually scalable and reliable
  • coordinate reference systems (SRID, spatial reference identifier) can be consistently dealt with
  • the I/O routes have usually been established by others

PostGIS setup

I set up a PostGIS system on a 64-bits debian etch (testing) system. It has a postgis package, and selecting this will probably draw the whole bunch of postgreql things in. I also installed odbc-postgresql and odbcinst1debian1, and r-cran-rodbc (which may take unixodbc-dev with it - it's a while back).

Next, you have to add users to the data base with the command createuser. This has to be done as user postgres (there's no passwd for this user, so su to root, and then su postgres). I added myself with passwd (had troubles with odbc if there was no passwd), and with superuser and create permissions.

Next, you create a data base, using e.g. createdb postgis, and start the interactive sql shell to it by psql postgis. To spatially-enable this data base, you have to add the following commands:

createlang plpgsql $1
psql -d $1 -f /usr/share/postgresql-8.1-postgis/lwpostgis.sql
psql -d $1 -f /usr/share/postgresql-8.1-postgis/spatial_ref_sys.sql

where paths may be different on your system. These commands add the needed data types (points, lines, polygons), functionality (spatial operations), and data (spatial reference systems), and set up links to libraries used for all this (geom, proj.4).

A test to see whether things work could be

psql postgis < meuse_gis.sql

where you obtain meuse_gis.sql from here.

ODBC connections

the .odbc.ini file in my home directory looks like this:

[PostgreSQL postgis]
Description         = PostgreSQL
Driver              = PostgreSQL ANSI
Trace               = No
TraceFile           = /tmp/psqlodbc.log
Database            = postgis
Servername          = localhost
UserName            =
Password            =
Port                =
ReadOnly            = No
RowVersioning       = No
ShowSystemTables    = No
ShowOidColumn       = No
FakeOidIndex        = No
ConnSettings        =

/etc/odbc.ini is empty (but I may have to move things there), but file /etc/odbcinst.ini looks like this

[PostgreSQL ANSI]
Description             = PostgreSQL ODBC driver (ANSI version)
Driver          = /usr/lib/odbc/psqlodbca.so
Setup           = /usr/lib/odbc/libodbcpsqlS.so
Debug           = 0
CommLog         = 1
UsageCount              = 1

[PostgreSQL Unicode]
Description             = PostgreSQL ODBC driver (Unicode version)
Driver          = /usr/lib/odbc/psqlodbcw.so
Setup           = /usr/lib/odbc/libodbcpsqlS.so
Debug           = 0
CommLog         = 1
UsageCount              = 1

and probably came with deb package odbc-postgresql.

PostGIS I/O

ogr2ogr example

ogr2ogr converts simple features data between file formats, and can thus read and write to PostGIS data bases, from or to one of the OGR supported formats, one of them being KML (read-only, gdal 1.3.2).

shp2pgsql example

I imported the nuts data obtained from Eurostat in the PostGIS data base by the following commands:

shp2pgsql -s 4019 ~paul/public_html/GISCO/NUTS/NUTS_RG_20M_2007/NUTS_RG_20M_2007 nuts20 | psql postgis > nuts20.out
shp2pgsql -s 4019 ~paul/public_html/GISCO/NUTS/NUTS_RG_10M_2007/NUTS_RG_10M_2007 nuts10 | psql postgis > nuts10.out
shp2pgsql -s 4019 ~paul/public_html/GISCO/NUTS/NUTS_RG_03M_2007/NUTS_RG_03M_2007 nuts03 | psql postgis > nuts03.out
shp2pgsql -s 4019 ~paul/public_html/GISCO/NUTS/NUTS_RG_01M_2007/NUTS_RG_01M_2007 nuts01 | psql postgis > nuts01.out

Note that I'm not 100% sure about the SRID 4019: I manually compared the .prj file that accompanies the shapefile (.shp):

GEOGCS["GCS_ETRS_1989",DATUM["D_ETRS_1989",SPHEROID["GRS_1980",6378137.0,298.257222101]],PRIMEM["Greenwich",0.0],UNIT["Degree",0.0174532925199433]]

with the entries in the spatial_ref_sys table in PostGIS, for 4019:

4019 | EPSG      |      4019 | GEOGCS["Unknown datum based upon the GRS 1980 ellipsoid",DATUM["Not_specified_based_on_GRS_1980_ellipsoid",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],AUTHORITY["EPSG","6019"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.01745329251994328,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4019"]]                                                                                                                                                                                                                                                                                                                                                                                                                                          
 +proj=longlat +ellps=GRS80 +no_defs

If not given, SRID -1 (longlat, WGS84) is assigned, and may lead to very similar if not identical results. Let me know if you find any problems.

(Intamap members can be granted access to these data when they obtain a username/passwd access to the data; replace the prefix ~paul/public_html with http://intamap.geo.uu.nl/~paul )

qgis

Quantum GIS (qgis) is a free map viewer that can directly read data from a PostGIS data base. I configured the PostGIS connection widget as follows:

Image:Qgis1.png

Choosing nuts01, after some zooming and reprojection (to our favourite LAEA-ETRS projection), we get

Image:Qgis2.png

Connecting PostGIS with R

The principal idea is that we insert monitoring network data in PostGIS, and then call R with an appropriate script, which includes information on what the data are called in the PostGIS data base.

through ODBC

odbc (open data base connectivity) does not typically know about spatial issues; everything has to go through SQL statements, and plain tables (data.frame objects) are typically returned. However, ogr also has an odbc driver, which may be used for spatial data connections (I did not try this).

library(RODBC)
channel <- odbcConnect("PostgreSQL postgis", "edzer", "edzer")

# retrieve table from data base:
table = sqlQuery(channel, "select nuts_id from nuts01c;")
# select and sort the country codes:
ids = factor(sort(as.character(table$nuts_id)))

# nuts01c has only the country borders and was formed by the SQL command:
#  select * into nuts01c from nuts01 where length(nuts_id)=2;
neighb = matrix(0, length(ids), length(ids))
dimnames(neighb) = list(ids, ids)
for (id in levels(ids)) {
  # form the SQL command:
  sqlCommand = paste(
        "SELECT a.nuts_id FROM nuts01c a, nuts01c b WHERE b.nuts_id = '", id,
        "' AND a.the_geom && b.the_geom AND touches(a.the_geom , b.the_geom) ",
        "GROUP BY a.nuts_id ORDER BY a.nuts_id;", sep = "")
  # do the SQL query:
  ret = sqlQuery(channel, sqlCommand)$nuts_id
  # fill the relevant matrix entries:
  neighb[match(id, ids), match(ret, ids)] = 1
}
odbcClose(channel)

options(width=110)
neighb

It outputs the table that indicates which countries share a common boundary:

   AT BE BG CH CY CZ DE DK EE ES FI FR GR HR HU IE IS IT LI LT LU LV MT NL NO PL PT RO SE SI SK TR UK
AT  0  0  0  1  0  1  1  0  0  0  0  0  0  0  1  0  0  1  1  0  0  0  0  0  0  0  0  0  0  1  1  0  0
BE  0  0  0  0  0  0  1  0  0  0  0  1  0  0  0  0  0  0  0  0  1  0  0  1  0  0  0  0  0  0  0  0  0
BG  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  1  0
CH  1  0  0  0  0  0  1  0  0  0  0  1  0  0  0  0  0  1  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
CY  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
CZ  1  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  1  0  0
DE  1  1  0  1  0  1  0  1  0  0  0  1  0  0  0  0  0  0  0  0  1  0  0  1  0  1  0  0  0  0  0  0  0
DK  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
EE  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0
ES  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0
FI  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  1  0  0  0  0
FR  0  1  0  1  0  0  1  0  0  1  0  0  0  0  0  0  0  1  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0
GR  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0
HR  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0
HU  1  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  1  1  0  0
IE  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
IS  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
IT  1  0  0  1  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0
LI  1  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
LT  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  1  0  0  0  0  0  0  0
LU  0  1  0  0  0  0  1  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
LV  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0
MT  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
NL  0  1  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
NO  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0
PL  0  0  0  0  0  1  1  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0  0
PT  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
RO  0  0  1  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
SE  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0
SI  1  0  0  0  0  0  0  0  0  0  0  0  0  1  1  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
SK  1  0  0  0  0  1  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0
TR  0  0  1  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
UK  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Of course computing the lower or upper triangle of the matrix would have sufficed. Needless to say that it would have been as easy to compute e.g. shortest distances between countries.

through OGR

reading from PostGIS

Running the R script:

library(rgdal)
meuse = readOGR("PG:dbname=postgis", "meuse2")
meuse[1:10,] # first ten rows
proj4string(meuse)

we get as output

> library(rgdal)
Loading required package: sp
Geospatial Data Abstraction Library extensions to R successfully loaded
> meuse = readOGR("PG:dbname=postgis", "meuse2")
> meuse[1:10,] # first ten rows
OGR data source with driver: PostgreSQL
Source: "PG:dbname=postgis", layer: "meuse2"
with  155  rows and  2  columns
        coordinates id zinc
1  (181072, 333611)  1 1022
2  (181025, 333558)  2 1141
3  (181165, 333537)  3  640
4  (181298, 333484)  4  257
5  (181307, 333330)  5  269
6  (181390, 333260)  6  281
7  (181165, 333370)  7  346
8  (181027, 333363)  8  406
9  (181060, 333231)  9  347
10 (181232, 333168) 10  183
> proj4string(meuse)
OGR data source with driver: PostgreSQL
Source: "PG:dbname=postgis", layer: "meuse2"
with  155  rows and  2  columns
[1] " +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.999908 +x_0=155000 +y_0=463000 +ellps=bessel +towgs84=565.237,50.0087,465.658,-0.406857,0.350733,-1.87035,4.0812 +units=m +no_defs"

Note that the projection information (SRID 28992) was retained, and passed to R through the OGR link. See for information about the OGR PostgreSQL driver this link.

writing to PostGIS

I'm going to write the first ten records of meuse back to the PostGIS data base using the R script:

library(rgdal)
writeOGR(meuse[1:10,], "PG:dbname=postgis", "meuse3", "PostgreSQL")

then, in an interactive psql session:


postgis=# select id, X(wkb_geometry), Y(wkb_geometry), zinc from meuse3;
 id |   x    |   y    | zinc
----+--------+--------+------
  1 | 181072 | 333611 | 1022
  2 | 181025 | 333558 | 1141
  3 | 181165 | 333537 |  640
  4 | 181298 | 333484 |  257
  5 | 181307 | 333330 |  269
  6 | 181390 | 333260 |  281
  7 | 181165 | 333370 |  346
  8 | 181027 | 333363 |  406
  9 | 181060 | 333231 |  347
 10 | 181232 | 333168 |  183
(10 rows)

postgis=# select srid(wkb_geometry) from meuse3 limit 1;
 srid
-------
 32767
(1 row)

postgis=# select proj4text from spatial_ref_sys where srid=32767;
                                              proj4text                                   
---------------------------------------------------------------------------------------------------------
 +proj=sterea +lat_0=52.15616055555555 +lon_0=5.38763888888889 +k=0.999908 +x_0=155000 +y_0=463000 +ellps=bessel towgs84=565.237,50.0087,465.658,-0.406857,0.350733,-1.87035,4.0812 +units=m +no_defs
(1 row)

Interestingly, in this case after PostGIS -> R -> PostGIS we end up with a different SRID, but a fitting proj4string.