class: middle, center, title-slide

# Maps
## STAE04: Data Visualization
### Johan Larsson
### The Department of Statistics, Lund University

---

## Maps

Data in the wild is often **spatial** in nature.

Basic functionality for working with maps is available directly in **ggplot2**.

For more advanced functionality, we need to involve other packages (and ggplot2 in fact relies on some of them for features such as map projections).

typically a good idea to go for a minimalist theme when mapping

```r
library(tidyverse)

theme_set(theme_void())
```

---

## Simple Maps

`map_data()` converts various objects to data that ggplot2 can use and offers a simple interface to the [maps](https://CRAN.R-project.org/package=maps) package.

```r
world <- map_data("world")

ggplot(world, aes(long, lat, group = group)) +
  geom_polygon() +
  coord_map(xlim = c(-180, 180), ylim = c(-52, 83.6))
```

<img src="14-maps_files/figure-html/unnamed-chunk-3-1.png" width="576" style="display: block; margin: auto;" />

---

## Spatial Data

Maps are nice in and of themselves, but what we're really looking for is to visualize some data.

.pull-left[
Spatial data comes in different forms:

- vector (polygon) data
- point (coordinate) data
- raster data

These different types of data can be (and often need to be) combined.
]

.pull-right[
<img src="images/raster-vector-data.png" width="100%" style="display: block; margin: auto;" />
]

---

## Vector Data

**example:** arrest data for US states from 1973 (`USArrests`)

```r
usa <- map_data("state")

# lower-case the state names so that they match the `region` column in `usa`
arr <- as_tibble(USArrests, rownames = "region") %>%
  mutate(region = tolower(region))

usa_arr <- left_join(usa, arr, by = "region")
```

<table class="table" style="font-size: 20px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr> <th style="text-align:right;"> long </th> <th style="text-align:right;"> lat </th> <th style="text-align:right;"> group </th> <th style="text-align:right;"> order </th> <th style="text-align:left;"> region </th> <th style="text-align:left;"> subregion </th> <th style="text-align:right;"> Murder </th> <th style="text-align:right;"> Assault </th> <th style="text-align:right;"> UrbanPop </th> <th style="text-align:right;"> Rape </th> </tr>
 </thead>
<tbody>
  <tr> <td style="text-align:right;"> -87.5 </td> <td style="text-align:right;"> 30.4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> alabama </td> <td style="text-align:left;"> NA </td> <td style="text-align:right;"> 13.2 </td> <td style="text-align:right;"> 236 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 21.2 </td> </tr>
  <tr> <td style="text-align:right;"> -87.5 </td> <td style="text-align:right;"> 30.4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> alabama </td> <td style="text-align:left;"> NA </td> <td style="text-align:right;"> 13.2 </td> <td style="text-align:right;"> 236 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 21.2 </td> </tr>
  <tr> <td style="text-align:right;"> -87.5 </td> <td style="text-align:right;"> 30.4 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> alabama </td> <td style="text-align:left;"> NA </td> <td style="text-align:right;"> 13.2 </td> <td style="text-align:right;"> 236 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 21.2 </td> </tr>
  <tr> <td style="text-align:right;"> -87.5 </td> <td style="text-align:right;"> 30.3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> alabama </td> <td style="text-align:left;"> NA </td> <td style="text-align:right;"> 13.2 </td> <td style="text-align:right;"> 236 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 21.2 </td> </tr>
  <tr> <td style="text-align:right;"> -87.6 </td> <td style="text-align:right;"> 30.3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> alabama </td> <td style="text-align:left;"> NA </td> <td style="text-align:right;"> 13.2 </td> <td style="text-align:right;"> 236 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 21.2 </td> </tr>
  <tr> <td style="text-align:right;"> -87.6 </td> <td style="text-align:right;"> 30.3 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:left;"> alabama </td> <td style="text-align:left;"> NA </td> <td style="text-align:right;"> 13.2 </td> <td style="text-align:right;"> 236 </td> <td style="text-align:right;"> 58 </td> <td style="text-align:right;"> 21.2 </td> </tr>
</tbody>
</table>

---

### Choropleths

Mapping a variable to fill color produces a **choropleth**.

```r
ggplot(usa_arr, aes(long, lat, group = group, fill = Murder)) +
  geom_polygon() +
  scale_fill_distiller(direction = 1, palette = "Reds") +
  coord_map()
```

<img src="14-maps_files/figure-html/unnamed-chunk-7-1.png" width="576" style="display: block; margin: auto;" />

**problem:** the area of a state influences the impression of the effect!
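---

One rough way to see how much state size varies is to compute polygon areas with the shoelace formula. The sketch below is an illustration only: `polygon_area()` is a hypothetical helper (it is not part of ggplot2 or maps), and with longitude/latitude input it returns squared degrees, which is only a crude proxy for true area.

```r
# Shoelace formula: area of a closed polygon from its vertex coordinates.
# Hypothetical helper for illustration; the vertices must be given in order
# around the polygon (as in the `order` column of map_data() output).
polygon_area <- function(x, y) {
  n <- length(x)
  # index of the next vertex, wrapping around to the first one
  nxt <- c(seq_len(n)[-1], 1)
  abs(sum(x * y[nxt] - x[nxt] * y)) / 2
}

# sanity check on a unit square
unit_square <- polygon_area(c(0, 1, 1, 0), c(0, 0, 1, 1))
unit_square
```

Applied per `group` in `usa_arr` (for instance inside `summarize()`) and summed per `region`, this would give a rough size for each state to compare against the fill variable.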
---

## Point Data

coordinates (longitude and latitude)

```r
airports <- read_tsv(
  "https://slcladal.github.io/data/airports.txt",
  col_names = TRUE
)
```

<table class="table" style="font-size: 20px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr> <th style="text-align:right;"> ID </th> <th style="text-align:left;"> Name </th> <th style="text-align:left;"> City </th> <th style="text-align:left;"> Country </th> <th style="text-align:right;"> Latitude </th> <th style="text-align:right;"> Longitude </th> </tr>
 </thead>
<tbody>
  <tr> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> Goroka Airport </td> <td style="text-align:left;"> Goroka </td> <td style="text-align:left;"> Papua New Guinea </td> <td style="text-align:right;"> -6.08 </td> <td style="text-align:right;"> 145 </td> </tr>
  <tr> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> Madang Airport </td> <td style="text-align:left;"> Madang </td> <td style="text-align:left;"> Papua New Guinea </td> <td style="text-align:right;"> -5.21 </td> <td style="text-align:right;"> 146 </td> </tr>
  <tr> <td style="text-align:right;"> 3 </td> <td style="text-align:left;"> Mount Hagen Kagamuga Airport </td> <td style="text-align:left;"> Mount Hagen </td> <td style="text-align:left;"> Papua New Guinea </td> <td style="text-align:right;"> -5.83 </td> <td style="text-align:right;"> 144 </td> </tr>
  <tr> <td style="text-align:right;"> 4 </td> <td style="text-align:left;"> Nadzab Airport </td> <td style="text-align:left;"> Nadzab </td> <td style="text-align:left;"> Papua New Guinea </td> <td style="text-align:right;"> -6.57 </td> <td style="text-align:right;"> 147 </td> </tr>
</tbody>
</table>

---

one possibility: just plot the points

```r
ggplot(airports, aes(Longitude, Latitude)) +
  geom_point(cex = 0.1, alpha = 0.5) +
  coord_map()
```

<img src="14-maps_files/figure-html/unnamed-chunk-10-1.png" width="576" style="display: block; margin: auto;" />

---

but we might also want to include the geography

```r
ggplot() +
  geom_polygon(aes(long, lat, group = group), fill = "grey", data = world) +
  geom_point(aes(Longitude, Latitude), cex = 0.1, data = airports) +
  coord_map(xlim = c(-180, 180), ylim = c(-52, 83.6))
```

<img src="14-maps_files/figure-html/unnamed-chunk-11-1.png" width="576" style="display: block; margin: auto;" />

---

### Adding Layers

Points alone do not always suffice to reveal patterns.

```r
ger_map <- filter(world, region == "Germany")
ger_air <- filter(airports, Country == "Germany", Longitude > 5, Latitude < 57)
```

--

.pull-left[
```r
p <- ggplot(
  ger_air,
  aes(Longitude, Latitude)
) +
  geom_polygon(
    aes(long, lat, group = group),
    inherit.aes = FALSE,
    fill = "gray",
    data = ger_map
  ) +
  coord_map()
p
```
]

.pull-right[
<img src="14-maps_files/figure-html/unnamed-chunk-13-1.png" width="230.4" style="display: block; margin: auto;" />
]

---

adding density estimates with contour lines and color

```r
p +
*  geom_density_2d_filled(alpha = 0.25) +
*  geom_density_2d(col = 1) +
  geom_point()
```

<img src="14-maps_files/figure-html/gerplot-1.png" width="396" style="display: block; margin: auto;" />

---

## Raster Data

Raster data is common in many areas, such as street maps or terrain data.

The [ggmap](https://CRAN.R-project.org/package=ggmap) package pulls raster map data from [Stamen Maps](http://maps.stamen.com) and [Google Maps](https://maps.google.com).
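
Before pulling map tiles, it may help to see what raster data is: values stored on a regular grid of cells. A minimal sketch, assuming only ggplot2 and the `volcano` elevation matrix that ships with base R (neither `volcano` nor `geom_raster()` is used elsewhere in these slides, and the names `terrain` and `raster_plot` are made up for this illustration):

```r
library(ggplot2)

# volcano is a matrix of elevations; reshape it to one row per grid cell.
# expand.grid() varies `row` fastest, which matches the column-major order
# that as.vector() uses for matrices.
terrain <- expand.grid(
  row = seq_len(nrow(volcano)),
  col = seq_len(ncol(volcano))
)
terrain$height <- as.vector(volcano)

# each grid cell becomes one filled tile
raster_plot <- ggplot(terrain, aes(col, row, fill = height)) +
  geom_raster() +
  coord_equal()
raster_plot
```
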
.pull-left[
```r
library(ggmap)

bbox <- c(left = -95.39681, right = -95.34188,
          bottom = 29.73631, top = 29.78400)

map <- get_stamenmap(
  bbox,
  maptype = "toner",
  zoom = 14
)

ggmap(map)
```
]

.pull-right[
<img src="14-maps_files/figure-html/unnamed-chunk-14-1.png" width="345.6" style="display: block; margin: auto;" />
]

---

### Crime in Houston

```r
crime2 <- filter(crime, offense == "robbery")
```

.pull-left[
```r
ggmap(map) +
  geom_point(
    aes(lon, lat),
    col = "firebrick",
    alpha = 0.5,
    data = crime2
  )
```
]

.pull-right[
<img src="14-maps_files/figure-html/unnamed-chunk-16-1.png" width="345.6" style="display: block; margin: auto;" />
]

---

### Crime in Houston

```r
ggmap(map, darken = 0.3) +
  stat_density2d(aes(lon, lat, fill = ..level..),
                 alpha = 0.5,
                 geom = "polygon",
                 data = crime2) +
  scale_fill_distiller(palette = "YlOrRd", direction = 1)
```

<img src="14-maps_files/figure-html/houston-crime-dens-1.png" width="345.6" style="display: block; margin: auto;" />

.footnote[
This example was adapted from <https://github.com/dkahle/ggmap/>.
]

---

## Geocoding

Geocoding converts text (addresses, landmarks) into **coordinates**.

`ggmap::geocode()`: interface to Google Maps or [the Data Science Toolkit](http://www.datasciencetoolkit.org/)

```r
*wh <- geocode("the white house", source = "dsk")

ggplot(map_data("state"), aes(long, lat, group = group)) +
  geom_polygon(col = "white") +
  geom_point(x = wh$lon, y = wh$lat, col = "red", size = 4) +
  coord_map("bonne", lat0 = 40)
```

<img src="14-maps_files/figure-html/unnamed-chunk-17-1.png" width="432" style="display: block; margin: auto;" />

---

## Networks

Networks are common in spatial data.

- cannot be easily visualized in ggplot2; use `ggnetworkmap()` from [GGally](https://CRAN.R-project.org/package=GGally) instead
- must first construct a network

**example:** airports and flights in the US (see the source code for the rather complicated data set)

--

```r
usa <- ggplot(map_data("state"), aes(long, lat)) +
  geom_polygon(aes(group = group), color = "white", fill = "light grey") +
  coord_map("bonne", lat0 = 40, xlim = c(-120, -70), ylim = c(25, 50))

library(GGally)

# `nw` is the network object constructed in the slide source (not shown here)
ggnetworkmap(usa, nw, great.circles = TRUE, alpha = 1, segment.alpha = 0.25,
             ring.group = degree, weight = degree,
             segment.color = "firebrick")
```

---

class: center, middle

<div class="figure" style="text-align: center">
<img src="14-maps_files/figure-html/unnamed-chunk-19-1.png" alt="Network map of flights in the US." width="720" />
<p class="caption">Network map of flights in the US.</p>
</div>

---

## Projections

The only truly accurate representation of the earth is as a **sphere**.

We need to **project** this spherical surface onto a plane.

.pull-left[
For maps that cover **large areas**, this inevitably leads to **distortions** in one (or several) of the following aspects:

- areas
- shapes
- directions
- distances

There is a plethora of map projections, which all preserve some of these features at the expense of others.
]

.pull-right[
<div class="figure" style="text-align: center">
<img src="images/mercator-tissot.png" alt="Tissot's indicatrices on the Mercator Projection of the world map." width="100%" />
<p class="caption">Tissot's indicatrices on the Mercator Projection of the world map.</p>
</div>
]

---

### Conformal

preserves angles (shapes) locally

```r
world_map <- ggplot(world, aes(long, lat, group = group)) +
  geom_path()

world_map +
  coord_map("mercator", xlim = c(-180, 180))
```

<div class="figure" style="text-align: center">
<img src="14-maps_files/figure-html/unnamed-chunk-21-1.png" alt="Mercator" width="432" />
<p class="caption">Mercator</p>
</div>

---

### Equal-Area

preserves relative areas

```r
world_map +
  coord_map("mollweide", xlim = c(-180, 180))
```

<div class="figure" style="text-align: center">
<img src="14-maps_files/figure-html/unnamed-chunk-22-1.png" alt="Mollweide" width="648" />
<p class="caption">Mollweide</p>
</div>

---

### Equidistant

preserves distances from one or two chosen points (here: the center of the map) to all other points, but not between arbitrary pairs of points

```r
world_map +
  coord_map("azequidistant", xlim = c(-180, 180))
```

<div class="figure" style="text-align: center">
<img src="14-maps_files/figure-html/unnamed-chunk-23-1.png" alt="Azimuthal Equidistant" width="360" />
<p class="caption">Azimuthal Equidistant</p>
</div>

---

### Compromises

tries to balance the distortions across the various aspects

```r
world_map +
  coord_map("gall", lat0 = 0, xlim = c(-180, 180))
```

<div class="figure" style="text-align: center">
<img src="14-maps_files/figure-html/unnamed-chunk-24-1.png" alt="Gall Stereographic" width="648" />
<p class="caption">Gall Stereographic</p>
</div>

---

### More Projections

See <https://en.wikipedia.org/wiki/List_of_map_projections> for an extensive list of different map projections.

<div class="figure" style="text-align: center">
<img src="images/projection-waterman.png" alt="The Waterman Butterfly Projection. A compromise projection." width="90%" />
<p class="caption">The Waterman Butterfly Projection. A compromise projection.</p>
</div>

---

## More about Maps

### Other Packages

There is a large collection of packages in R that deal with spatial data, many with more functionality than ggplot2.

See [the CRAN task view on spatial data](https://cran.r-project.org/web/views/Spatial.html) for a comprehensive run-down of these packages and their functionality.

### Where to Go Next

If you want to read more about mapping with ggplot2, we recommend the [Maps chapter in the ggplot2 book](https://ggplot2-book.org/maps.html).