- Understand what constitutes a point process
- Detect and create point-pattern density surfaces
- Identify how points are clustered in space (uniform, random, clustered).
- Visualize market point processes.

- Read Chapter 8: Point Pattern Analysis of the R Spatial online textbook. Our lab will closely follow this chapter, with some fun bonuses weaved in from other sources, the most notable of which is:
- Analysing spatial point patterns in
`R`

by Adrian Baddeley. This is an immensely helpful overview of point pattern theory and its implementation in`R`

. If you plan on applying point pattern analysis to your data, I*strongly*recommend you download and read this text, particularly pages 6 - 17 and 178 - 180.

- In this tutorial, I’ll be mostly using base
`R`

to plot the results of our analyses. You can generate similar plots using`ggplot2`

, but since the focus of this lab is*understanding*point pattern analysis and the`spatstat`

package, we will only use the simple`plot()`

function. - I assume that you know what I mean when I talk about a variable’s “distribution”, you understand what a goodness-of-fit test is (i.e. \(\chi^2\)), and you are familiar with cumulative distribution functions. No? Then watch this for a review of CDF and this for a review of \(\chi^2\).

```
library(sp)
library(tidyverse)
library(spatstat)
library(raster)
```

NOTE: Point pattern analysis depends on DISTANCES so be absolutely sure that your data is in a **PROJECTED COORDINATE SYSTEM.**

Today, I’ll work with a few data sets. Many some standard with the `spatstat`

package (`lansing`

, `longleaf`

, etc.). One set of data you will have to download from Chapter 8 of the R Spatial online textbook. At the top of Chapter 8, you will see hyperlinked `crime`

, `city`

, and `census`

`.rds`

files which you should download and store locally on your computer before beginning the tutorial. These `.rds`

files are shapefiles containing census data, a city boundary, and incidences of crime.

```
city <- readRDS("./data/city.rds")
census <- readRDS("./data/census2000.rds")
crime <- readRDS("./data/crime.rds")
plot(city)
plot(crime, add=T, col = "red" , cex = 0.5, pch = "*")
```

The `city`

`SpatialPolygonsDataFrame`

delineates the boundary of a city. The `census`

data set contains lots of data on the demographics of census blocks within the city. `Crime`

contains information about crimes that occurred in the city as well as their location. The types of crimes and number of observations within each crime category are listed below:

`crime@data %>% group_by(CATEGORY) %>% summarize(n = n()) %>% arrange(desc(n))`

```
## # A tibble: 15 x 2
## CATEGORY n
## <chr> <int>
## 1 Petty Theft 665
## 2 Vandalism 355
## 3 Drunk in Public 232
## 4 Vehicle Burglary 221
## 5 Residential Burglary 219
## 6 DUI 212
## 7 Assaults 172
## 8 Commercial Burglary 143
## 9 Grand Theft 143
## 10 Drugs or Narcotics 134
## 11 Auto Theft 86
## 12 Robbery 49
## 13 Weapons 15
## 14 Arson 9
## 15 Hate Crimes 6
```

Point data gives the location of objects (trees) or events (crime) within an area. Often, in addition to coordinates, points have other data attached to them called *marks*. Marks are essentially attributes of the point (i.e. species of tree, type of crime). Marks could be categorical (type of crime) or continuous (diameter of tree). Marks are different from *covariates*. Marks are attributes associated with the point. Covariates are any kind of data that can be linked to the points as an explanatory variable (climate data, elevation, socioeconomic data, land use, etc.). We can answer a number of important questions with point-pattern data, including:

- What is the
*intensity*of a process (e.g. crime hot spots)? - How are the points
*distributed*in space (e.g. uniform, random, clustered, see below)? - How does the intensity of points
*covary*with other processes (e.g. species density and elevation, cancer and pollution exposure)? - How do points with different marks
*segregate*across space (e.g. how do the distributions of species age and size vary differently across space?)

We can think of a *point process* as a black box that generates a random observed spatial point pattern according to some rules. Point processes are assumed to be (1) random (locations and numbers of points are random) and (2) do not have a serial order (unless there are marks attached). Point processes are located within a sampling window which is important to identify and define.

`spatstat`

In this tutorial, we’ll be working with the `spatstat`

package, so make sure you’ve installed this on your machine. `spatstat`

stores data a bit differently than the `sp`

package, so I’ll begin by explaining the peculiarities of this package. `spatstat`

classifies objects in the following way:

`ppp`

: planar point pattern`owin`

: spatial region or observation window`im`

: pixel image`psp`

: a pattern of line segments (we won’t cover this in our class, but if you are working with polylines and are interested, check out this text)`tess`

: tesselations, tiling using shapes, think shapefile

There are a series of methods associated with each class. To get this list, type:

`methods(class = "ppp") # replace ppp with class name of interest`

```
## [1] [ [<- affine anyDuplicated
## [5] as.data.frame as.im as.layered as.owin
## [9] as.ppp auc berman.test boundingbox
## [13] boundingcentre boundingcircle boundingradius by
## [17] cdf.test circumradius closepairs closing
## [21] coerce connected coords coords<-
## [25] crossdist crosspairs cut density
## [29] dilation distfun distmap domain
## [33] duplicated edit envelope erosion
## [37] fardist flipxy Frame<- has.close
## [41] head identify initialize intensity
## [45] iplot is.connected is.empty is.marked
## [49] is.multitype kppm markformat marks
## [53] marks<- multiplicity nnclean nncross
## [57] nndensity nndist nnfun nnwhich
## [61] nobjects npoints opening pairdist
## [65] pcf periodify pixellate plot
## [69] ppm print quadrat.test quadratcount
## [73] quantess rebound relevel relrisk
## [77] rescale rhohat roc rotate
## [81] round rounding rshift scalardilate
## [85] scanmeasure segregation.test sharpen shift
## [89] show slotsFromS3 Smooth Smoothfun
## [93] split split<- subset summary
## [97] superimpose tail unique unitname
## [101] unitname<- unmark unstack Window
## [105] Window<-
## see '?methods' for accessing help and source code
```

It’s fairly easy to transform data between `SpatialPoints`

`sp`

objects and the `ppp`

objects used in `spatstat`

. Let’s start by loading data from the `spatstat`

package:

```
data("swedishpines")
x <- swedishpines
plot(x)
```

This sample data sets plots the occurrence of Swedish Pines within a small window.

`class(x)`

`## [1] "ppp"`

`summary(x)`

```
## Planar point pattern: 71 points
## Average intensity 0.007395833 points per square unit (one unit = 0.1
## metres)
##
## Coordinates are integers
## i.e. rounded to the nearest unit (one unit = 0.1 metres)
##
## Window: rectangle = [0, 96] x [0, 100] units
## Window area = 9600 square units
## Unit of length: 0.1 metres
```

Notice that the `x`

variable we created is of class `ppp`

. We see that it includes `71 points`

and that the average intensity (points/unit area) of these points is `.0074 points per square unit`

or in this case 0.74 trees per square meter.

Unless you acquire magical data, your data will not come to you in `ppp`

format. Working things into `ppp`

(then later `sp`

for advanced plotting) will help you easily apply the complex analyses and functions listed below to your data, so we’re going to work through an example of loading a ’`.csv`

of coordinates and attributes into `R`

, transforming this to a `ppp`

object, then to an `sp`

`SpatialPoints`

object. Let’s look at the location of dumps in the western US:

```
lf <- read.table("./data/landfill.txt", sep=",", stringsAsFactors = F, skip = 1)
colnames(lf) <- c("id", "lat", "lon")
```

```
# 1. store x and y coords in two vectors
lon <- lf$lon
lat <- lf$lat
# 2. create two vectors xrange and yrange with dimensions of triangle that contain all points
xrange <- range(lon, na.rm=T)
yrange <- range(lat, na.rm=T)
# 3. create ppp
lf <- ppp(lon, lat, xrange, yrange)
```

`## Warning: data contain duplicated points`

`plot(lf)`