R
by Adrian Baddeley. This is an immensely helpful overview of point pattern theory and its implementation in R
. If you plan on applying point pattern analysis to your data, I strongly recommend you download and read this text, particularly pages 6 - 17 and 178 - 180.R
to plot the results of our analyses. You can generate similar plots using ggplot2
, but since the focus of this lab is understanding point pattern analysis and the spatstat
package, we will only use the simple plot()
function.library(sp)
library(tidyverse)
library(spatstat)
library(raster)
NOTE: Point pattern analysis depends on DISTANCES so be absolutely sure that your data is in a PROJECTED COORDINATE SYSTEM.
Today, I’ll work with a few data sets. Many some standard with the spatstat
package (lansing
, longleaf
, etc.). One set of data you will have to download from Chapter 8 of the R Spatial online textbook. At the top of Chapter 8, you will see hyperlinked crime
, city
, and census
.rds
files which you should download and store locally on your computer before beginning the tutorial. These .rds
files are shapefiles containing census data, a city boundary, and incidences of crime.
city <- readRDS("./data/city.rds")
census <- readRDS("./data/census2000.rds")
crime <- readRDS("./data/crime.rds")
plot(city)
plot(crime, add=T, col = "red" , cex = 0.5, pch = "*")
The city
SpatialPolygonsDataFrame
delineates the boundary of a city. The census
data set contains lots of data on the demographics of census blocks within the city. Crime
contains information about crimes that occurred in the city as well as their location. The types of crimes and number of observations within each crime category are listed below:
crime@data %>% group_by(CATEGORY) %>% summarize(n = n()) %>% arrange(desc(n))
## # A tibble: 15 x 2
## CATEGORY n
## <chr> <int>
## 1 Petty Theft 665
## 2 Vandalism 355
## 3 Drunk in Public 232
## 4 Vehicle Burglary 221
## 5 Residential Burglary 219
## 6 DUI 212
## 7 Assaults 172
## 8 Commercial Burglary 143
## 9 Grand Theft 143
## 10 Drugs or Narcotics 134
## 11 Auto Theft 86
## 12 Robbery 49
## 13 Weapons 15
## 14 Arson 9
## 15 Hate Crimes 6
Point data gives the location of objects (trees) or events (crime) within an area. Often, in addition to coordinates, points have other data attached to them called marks. Marks are essentially attributes of the point (i.e. species of tree, type of crime). Marks could be categorical (type of crime) or continuous (diameter of tree). Marks are different from covariates. Marks are attributes associated with the point. Covariates are any kind of data that can be linked to the points as an explanatory variable (climate data, elevation, socioeconomic data, land use, etc.). We can answer a number of important questions with point-pattern data, including:
We can think of a point process as a black box that generates a random observed spatial point pattern according to some rules. Point processes are assumed to be (1) random (locations and numbers of points are random) and (2) do not have a serial order (unless there are marks attached). Point processes are located within a sampling window which is important to identify and define.
spatstat
In this tutorial, we’ll be working with the spatstat
package, so make sure you’ve installed this on your machine. spatstat
stores data a bit differently than the sp
package, so I’ll begin by explaining the peculiarities of this package. spatstat
classifies objects in the following way:
ppp
: planar point patternowin
: spatial region or observation windowim
: pixel imagepsp
: a pattern of line segments (we won’t cover this in our class, but if you are working with polylines and are interested, check out this text)tess
: tesselations, tiling using shapes, think shapefileThere are a series of methods associated with each class. To get this list, type:
methods(class = "ppp") # replace ppp with class name of interest
## [1] [ [<- affine anyDuplicated
## [5] as.data.frame as.im as.layered as.owin
## [9] as.ppp auc berman.test boundingbox
## [13] boundingcentre boundingcircle boundingradius by
## [17] cdf.test circumradius closepairs closing
## [21] coerce connected coords coords<-
## [25] crossdist crosspairs cut density
## [29] dilation distfun distmap domain
## [33] duplicated edit envelope erosion
## [37] fardist flipxy Frame<- has.close
## [41] head identify initialize intensity
## [45] iplot is.connected is.empty is.marked
## [49] is.multitype kppm markformat marks
## [53] marks<- multiplicity nnclean nncross
## [57] nndensity nndist nnfun nnwhich
## [61] nobjects npoints opening pairdist
## [65] pcf periodify pixellate plot
## [69] ppm print quadrat.test quadratcount
## [73] quantess rebound relevel relrisk
## [77] rescale rhohat roc rotate
## [81] round rounding rshift scalardilate
## [85] scanmeasure segregation.test sharpen shift
## [89] show slotsFromS3 Smooth Smoothfun
## [93] split split<- subset summary
## [97] superimpose tail unique unitname
## [101] unitname<- unmark unstack Window
## [105] Window<-
## see '?methods' for accessing help and source code
It’s fairly easy to transform data between SpatialPoints
sp
objects and the ppp
objects used in spatstat
. Let’s start by loading data from the spatstat
package:
data("swedishpines")
x <- swedishpines
plot(x)
This sample data sets plots the occurrence of Swedish Pines within a small window.
class(x)
## [1] "ppp"
summary(x)
## Planar point pattern: 71 points
## Average intensity 0.007395833 points per square unit (one unit = 0.1
## metres)
##
## Coordinates are integers
## i.e. rounded to the nearest unit (one unit = 0.1 metres)
##
## Window: rectangle = [0, 96] x [0, 100] units
## Window area = 9600 square units
## Unit of length: 0.1 metres
Notice that the x
variable we created is of class ppp
. We see that it includes 71 points
and that the average intensity (points/unit area) of these points is .0074 points per square unit
or in this case 0.74 trees per square meter.
Unless you acquire magical data, your data will not come to you in ppp
format. Working things into ppp
(then later sp
for advanced plotting) will help you easily apply the complex analyses and functions listed below to your data, so we’re going to work through an example of loading a ’.csv
of coordinates and attributes into R
, transforming this to a ppp
object, then to an sp
SpatialPoints
object. Let’s look at the location of dumps in the western US:
lf <- read.table("./data/landfill.txt", sep=",", stringsAsFactors = F, skip = 1)
colnames(lf) <- c("id", "lat", "lon")
# 1. store x and y coords in two vectors
lon <- lf$lon
lat <- lf$lat
# 2. create two vectors xrange and yrange with dimensions of triangle that contain all points
xrange <- range(lon, na.rm=T)
yrange <- range(lat, na.rm=T)
# 3. create ppp
lf <- ppp(lon, lat, xrange, yrange)
## Warning: data contain duplicated points
plot(lf)