Pre-processing eurdep

From Intamap

Pre-Processing

In the pre-processing step, the data are read, various parameters are set, and an object is created for parameter estimation and prediction. The pre-processing steps for the Eurdep data so far consist of the following routines:

eurdepLoad

Function that loads eurdep data. The syntax is:

intamapLoad(fileName, dataType,...)

Possible dataTypes are:

  • eurdep - Either eurdep data sets or the simulations
    • If simulations are used, it is also necessary to pass the argument aggr=FALSE
    • Possible names of simulations: INTAMP_scenario6_1080.txt, INTAMP_scenario6_1800.txt ...
  • eurdepAverage - monthly or annual averages from 2006
    • File names: 2006_1.csv, 2006_2.csv, ...
  • ... Other arguments to be passed to sub-procedures

The output from eurdepLoad is a SpatialPointsDataFrame (SPDF) with coordinates, observations, and some additional data. The data frame must contain at least the following columns:

  • regCode = region code; for the Eurdep data this is the two-letter country code, stored as a factor in the SPDF
  • stationNumber = the station ID, stored as a factor in the SPDF
  • value = the observed value

Additionally, it can include:

  • elevation = the elevation of the station, either read directly from the data, or found from a DEM
  • soil_type = code for soil type
  • timbeg = the start time of the observation, stored as an R date-time object of class POSIXct (which also inherits from POSIXt)
  • timend = the end time of the observation
  • group = the network group number (default is 1; it differs if there is more than one network in a country)
  • cluster = the cluster number (default is 1, can be changed by the Cluster identifying module P3)
  • device = type of device used for monitoring
  • uncertainty = uncertainty of the observation, error characteristics (Aston, we need some guidance on how to define this one)
  • spatSup = spatial support of observation
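
The column requirements above can be checked before the data are passed on to the later steps. The following is a minimal sketch in base R. The helper name checkEurdepColumns is hypothetical and not part of the Intamap code, the requirement that value is numeric is an assumption, and the example data are made up. The helper works on a plain data frame; for a SpatialPointsDataFrame one would pass its @data slot.

```r
# Hypothetical helper (not part of Intamap): verify that the attribute table
# of the loaded data has the required columns with the expected types.
checkEurdepColumns <- function(df) {
  required <- c("regCode", "stationNumber", "value")
  missingCols <- setdiff(required, names(df))
  if (length(missingCols) > 0) {
    stop("Missing required column(s): ", paste(missingCols, collapse = ", "))
  }
  # regCode and stationNumber are described as factors in the SPDF
  if (!is.factor(df$regCode) || !is.factor(df$stationNumber)) {
    stop("regCode and stationNumber must be stored as factors")
  }
  # Assumption: the observed value is numeric
  if (!is.numeric(df$value)) {
    stop("value must be numeric")
  }
  # timbeg and timend are optional; if present, expect date-time objects
  for (col in intersect(c("timbeg", "timend"), names(df))) {
    if (!inherits(df[[col]], "POSIXt")) {
      stop(col, " must be a date-time (POSIXt) object")
    }
  }
  invisible(TRUE)
}

# Made-up example data, for illustration only
stations <- data.frame(
  regCode = factor(c("NO", "NO", "SI")),
  stationNumber = factor(c("101", "102", "201")),
  value = c(95, 102, 110),
  timbeg = as.POSIXct(rep("2006-01-01 00:00:00", 3), tz = "UTC")
)
ok <- checkEurdepColumns(stations)
```

Stopping with an error, rather than silently repairing the data, keeps problems visible at load time instead of deep inside the interpolation.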


Creation of data object

The suggestion is that we create one object which contains all necessary information. So far, the data object is created manually as a list(), to which further data can be added at a later stage. At the moment, for a data set to be kriged, we can create the object as follows:

> krigingObject = list(
>     pointData = observations,
>     predictionLocations = predictionLocations,
>     targetCRS = "+init=epsg:3035",
>     formulaString = as.formula(value~1),
>     ck = ck,
>     params = getParams()
> )
> class(krigingObject) = c("intamap","eurdep","automap")

This object hence belongs to three classes: "intamap", "eurdep" and "automap". All intamap objects should belong to the first class, whereas the other two classes depend on the data type and the method to be applied. The data type should only be of concern for the pre- and post-processing steps.
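
The effect of the class vector can be illustrated with R's S3 dispatch. The sketch below assumes that preProcess is an S3 generic, as the name preProcess.eurdep suggests. The method bodies and the preProcessedBy element are placeholders invented for the illustration, not the Intamap implementation.

```r
# Generic for the pre-processing step; dispatches on the class vector
preProcess <- function(object, ...) UseMethod("preProcess")

# Fallback for objects without a data-type-specific method
preProcess.default <- function(object, ...) {
  object$preProcessedBy <- "default"
  object
}

# Data-type-specific method, selected via the "eurdep" class
preProcess.eurdep <- function(object, ...) {
  object$preProcessedBy <- "eurdep"
  object
}

# Placeholder object with the same class vector as in the text
krigingObject <- list(pointData = NULL, params = list())
class(krigingObject) <- c("intamap", "eurdep", "automap")

# No preProcess.intamap is defined here, so dispatch moves on to
# the next class in the vector and calls preProcess.eurdep
result <- preProcess(krigingObject)
print(result$preProcessedBy)
```

Because the classes are tried in order, the data type ("eurdep") selects the pre- and post-processing methods, while a method-specific class such as "automap" can select the interpolation routines in the same way.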

The object also includes the observations, prediction locations, the target projection, information about which value is to be interpolated (and relations in case of universal kriging), a list of countries and some parameters.

The function getParams sets several parameters for both pre-processing and interpolation. Among the parameters included so far (with default values):

> formulaString = as.formula(value~1)
> doAnisotropy = FALSE
> removeBias = c("localBias","countryBias")
> addBias = c("countryBias")
> doCluster = FALSE
> maxCluster = 0
> numberOfClusters = 1
> methodIdentifier = 6
> isEmergency = FALSE

Not all of these parameters are used yet. The parameters removeBias and addBias specify which kinds of bias or drift to remove or add. Elevation and soil effects can also be added to these parameters. If some parameters are to be changed from the default, this can be done e.g. by:

> params = getParams(doAnisotropy=TRUE)
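
How such a function can combine defaults with user overrides is sketched below. This is not the actual getParams from Intamap; getParamsSketch is a hypothetical stand-in that uses the default values listed above and base R's modifyList to apply any named arguments on top of them.

```r
# Hypothetical stand-in for getParams(): defaults plus named overrides
getParamsSketch <- function(...) {
  defaults <- list(
    formulaString = as.formula(value ~ 1),
    doAnisotropy = FALSE,
    removeBias = c("localBias", "countryBias"),
    addBias = c("countryBias"),
    doCluster = FALSE,
    maxCluster = 0,
    numberOfClusters = 1,
    methodIdentifier = 6,
    isEmergency = FALSE
  )
  # Named arguments replace the defaults of the same name
  modifyList(defaults, list(...))
}

params <- getParamsSketch(doAnisotropy = TRUE)
print(params$doAnisotropy)
print(params$removeBias)
```

Only the arguments that are passed explicitly change; all other entries keep their default values.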


preProcess.eurdep

The pre-processing step for the eurdep data does the following:

  • Cleans the data and removes multiple observations at the same location (a more robust implementation is needed here)
  • Removes biases
    • Locally (if there are multiple networks in a country; at the moment Slovenia and Hungary)
    • Regionally (the biases between each country)
  • Attaches the biases to the data object for possible addition of the biases in the post-processing step
  • Finds the elevations of the stations and the prediction locations from a DEM if not included at an earlier stage
  • Subtracts the elevation effect (if specified in the parameters)
  • The last two steps are not yet fully implemented
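
The first steps of the list above can be sketched in base R on a plain data frame. This is only an illustration under strong assumptions: the data are made up, duplicates are resolved by keeping the first observation at each location, and the country bias is approximated as the difference between the country mean and the overall mean, which is not necessarily how Intamap estimates it. The biases element of the resulting object is a hypothetical name.

```r
# Made-up observations; x and y are station coordinates
obs <- data.frame(
  x = c(0, 0, 1, 2, 3),
  y = c(0, 0, 1, 2, 3),
  regCode = factor(c("SI", "SI", "SI", "HU", "HU")),
  value = c(100, 110, 102, 120, 124)
)

# Step 1: keep only the first observation at each location
obs <- obs[!duplicated(obs[, c("x", "y")]), ]

# Step 2: crude country bias = country mean minus overall mean (assumption)
countryMean <- sapply(split(obs$value, obs$regCode), mean)
countryBias <- countryMean - mean(obs$value)

# Step 3: subtract each station's country bias from its value
obs$value <- obs$value - as.numeric(countryBias[as.character(obs$regCode)])

# Step 4: attach the biases so they can be added back in post-processing
dataObject <- list(pointData = obs, biases = list(countryBias = countryBias))
print(dataObject$biases$countryBias)
```

Storing the estimated biases next to the corrected data is what allows the post-processing step to restore them for the countries listed in addBias.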



Still not fully implemented - taken from old description of setup

Data treatment

If not yet done by the network owner or the database owner, this module should check the data and at least

  • identify possible data errors and remove them
  • identify possible emergency situations and switch the system into emergency mode

or be able to classify events according to the list above, unless done by the network owner.
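
A minimal sketch of the two checks is given below. Both limits are hypothetical values chosen only for illustration, as are the example observations; suitable criteria would have to come from the network owners.

```r
# Hypothetical limits, chosen only for illustration
maxPlausible <- 1e6     # values above this are treated as data errors
emergencyLevel <- 500   # values above this switch on emergency mode

# Made-up observed values
values <- c(98, 105, NA, -3, 2e6, 640, 110)

# Data errors: missing, negative, or implausibly large values
isError <- is.na(values) | values < 0 | values > maxPlausible
cleaned <- values[!isError]

# Emergency mode if any remaining value exceeds the emergency level
isEmergency <- any(cleaned > emergencyLevel)
print(cleaned)
print(isEmergency)
```

The resulting flag could then be passed on through the isEmergency parameter of the data object.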