class: middle, center, title-slide .title[ # Maps ] .subtitle[ ## Data Visualization ] .author[ ### Johan Larsson ] .author[ ### Behnaz Pirzamanbein ] .institute[ ### The Department of Statistics, Lund University ] --- ## Maps Data in the wild is often **spatial** in nature. If you find spatial data, your first urge should be to map it! ### Themes for Maps The default ggplot2 theme is not suitable for maps. ```r library(tidyverse) theme_set(theme_void()) ``` `theme_bw()` is also a good choice, particularly if you need coordinates. --- ## Mapping with ggplot2: Two Methods ### 1. Native ggplot2 - `geom_polygon()` and `coord_map()` - no external dependencies - fine for simple maps -- .pull-left[ ### 2. Simple Features What we will use instead - `geom_sf()` and `coord_sf()` - much more powerful - heavy-lifting done by the [sf](https://cran.r-project.org/package=sf) package ] .pull-right[ <img src="images/simple-features.jpg" width="100%" style="display: block; margin: auto;" /> ] --- ## Hello World! ```r library(rnaturalearth) world <- ne_countries(returnclass = "sf") ggplot(world) + geom_sf(fill = "black", col = "white") ``` <img src="maps_files/figure-html/unnamed-chunk-4-1.png" width="720" style="display: block; margin: auto;" /> --- ## Spatial Data Maps are nice in and by themselves, but what we're really looking for is to visualize some data. .pull-left[ ### Forms of Spatial Data - vector (polygon) data - point (coordinate) data - raster data Visualizations can combine these types however you like. ] .pull-right[ <img src="images/raster-vector-data.png" width="100%" style="display: block; margin: auto;" /> ] --- ## Vector Data We often have separate data sources: one for the map and one for the data of interest. We often need to merge these data sets together. Simple features are compatible with most tidyverse operations: `filter()`, `select()`, `mutate()`, `group_by()`, `*_join()` etc. -- ### Example: Crime Data in US states from 1974 ```r # map data usa <- ne_states(iso_a2 = "US", returnclass = "sf") %>% filter(!(postal %in% c("AK", "HI"))) # data of interest arr <- as_tibble(USArrests, rownames = "region") # merge the simple features map with the data usa_arr <- left_join(usa, arr, by = c("gn_name" = "region")) ``` --- ### Choropleths Mapping to fill color produces a **choropleth**. ```r ggplot(usa_arr, aes(fill = Murder)) + geom_sf(col = "white") + scale_fill_distiller(direction = 1, palette = "Reds") ``` <img src="maps_files/figure-html/unnamed-chunk-7-1.png" width="648" style="display: block; margin: auto;" /> It is easy to read, but the area of the region influences the visual impression. --- ### Choropleth Alternatives We can use **sf** package to get centroid of each state and plot bubbles ```r centroids <- sf::st_centroid(usa_arr) ggplot(usa) + geom_sf(col = "white") + geom_sf(aes(size = Murder), data = centroids) ``` <img src="maps_files/figure-html/unnamed-chunk-8-1.png" width="648" style="display: block; margin: auto;" /> It is harder to read, but the visual impression is no longer influenced by area. --- ## Point Data longitude (x-axis coordinates) and latitude (y-axis coordinates) ```r airports_raw <- read_tsv( paste0("https://github.com/stat-lu/dataviz/", "raw/main/data/airports.txt"), col_names = TRUE ) ``` If we want to be able to use projections, we need to convert this data to a simple features object. ```r library(sf) airports <- st_as_sf( airports_raw, coords = c("Longitude", "Latitude"), crs = 4326 # use this if you have longitude and latitudes ) ``` .footnote[ Original data is from <https://slcladal.github.io/data/airports.txt>. ] --- ### Stacking Layers It is easy to stack layers of different geometries (here: polygons and points). ```r ggplot() + geom_sf(data = world, col = "white") + geom_sf(data = airports, cex = 0.1) ``` <img src="maps_files/figure-html/unnamed-chunk-11-1.png" width="720" style="display: block; margin: auto;" /> --- ### Airports in Skåne <ol> <li> filter the data </ol> ```r skane_map <- ne_states("Sweden", returnclass = "sf") %>% filter(name == "Skåne") # filter out only airports in skåne (using st_intersection) airports_sweden <- filter(airports, Country == "Sweden") skane_air <- st_intersection(airports_sweden, skane_map) ``` -- .pull-left[ <ol start = 2> <li> plot it! </ol> ```r p <- ggplot(skane_map) + geom_sf(col = "white") + geom_sf(data = skane_air) p ``` ] .pull-right[ <img src="maps_files/figure-html/unnamed-chunk-13-1.png" width="252" style="display: block; margin: auto;" /> ] --- ### Adding Text Labels Let's add some labels too! .pull-left[ ```r library(ggrepel) p + geom_sf_label( aes(label = Name), data = skane_air, nudge_y = 0.06, nudge_x = 0.2 ) ``` ] .pull-right[ <img src="maps_files/figure-html/unnamed-chunk-14-1.png" width="345.6" style="display: block; margin: auto;" /> ] --- ## Raster Data Raster data is common in many areas, such as street map or terrain data. The [ggmap](https://CRAN.R-project.org/package=ggmap) package pulls raster map data from [Stamen Maps](http://maps.stamen.com) and [Google Maps](https://maps.google.com). .pull-left[ ```r library(ggmap) bbox <- c( left = -95.39681, right = -95.34188, bottom = 29.73631, top = 29.78400 ) map <- get_stadiamap( bbox, maptype = "stamen_toner", zoom = 14 ) ggmap(map) ``` ] .pull-right[ <img src="maps_files/figure-html/unnamed-chunk-15-1.png" width="345.6" style="display: block; margin: auto;" /> ] --- ### Crime in Houston ```r robberies <- filter(crime, offense == "robbery") ``` Stacking layers on raster maps is also easy. .pull-left[ ```r houston_plot <- ggmap(map) + geom_point( aes(lon, lat), col = "firebrick", alpha = 0.5, data = robberies ) houston_plot ``` ] .pull-right[ <img src="maps_files/figure-html/unnamed-chunk-17-1.png" width="345.6" style="display: block; margin: auto;" /> ] --- ### Crime in Houston: 2D Density Layer .pull-left[ ```r houston_plot + geom_density_2d_filled( aes(lon, lat), alpha = 0.5, data = robberies, show.legend = FALSE ) + scale_fill_brewer( palette = "YlOrRd", direction = 1 ) ``` ] .pull-right[ <img src="maps_files/figure-html/unnamed-chunk-18-1.png" width="345.6" style="display: block; margin: auto;" /> ] .footnote[ This example was adapted from <https://github.com/dkahle/ggmap/>. ] --- ## Geocoding (Address Data) Geocoding converts addresses or names of places into **coordinates**. Many options are available; here we use the [tidygeocoder package](https://CRAN.R-project.org/package=tidygeocoder): ```r *lu <- tidygeocoder::geo("Lund University, Sweden") ``` .pull-left[ ```r lu_sf <- st_as_sf( lu, coords = c("long", "lat"), crs = 4326 ) ggplot(skane_map) + geom_sf() + geom_sf( data = lu_sf, cex = 3, col = "firebrick1" ) ``` ] .pull-right[ <img src="maps_files/figure-html/unnamed-chunk-20-1.png" width="324" style="display: block; margin: auto;" /> ] --- ## Projections The only truly accurate representation of the earth is as a **sphere**. We need to **project** this spherical surface onto a plane. .pull-left[ This leads to **distortions** in one (or several) of the following aspects: - areas - shapes - directions - distances The larger the scale of the map is, the worse the distortion becomes. Every map projection is a compromise. ] .pull-right[ <div class="figure" style="text-align: center"> <img src="images/mercator-tissot.png" alt="Tissot's indicatrices on the Mercator Projection of the world map." width="100%" /> <p class="caption">Tissot's indicatrices on the Mercator Projection of the world map.</p> </div> ] --- ### Projections With ggplot2 and sf Simplest approach is to find the EPSG code for the projection you want at <https://epsg.io/>. - copy-paste the first line, for instance "ESRI:54016" for the World Gall Stereographic projection - add `coord_sf(crs = sf::st_crs("ESRI:54016"))` to the ggplot2 call ```r # base map # default projection: WGS84 (x = longitude, y = latitude) world_map <- world %>% filter(continent != "Antarctica") %>% ggplot() + geom_sf() + theme_bw() ``` --- ### Conformal - preserves angles (shapes) locally ```r library(sf) # for st_crs world_map + coord_sf(crs = st_crs("EPSG:3857")) ``` <div class="figure" style="text-align: center"> <img src="maps_files/figure-html/unnamed-chunk-23-1.png" alt="Mercator" width="504" /> <p class="caption">Mercator</p> </div> --- ### Equal-Area - preserves area measure ```r world_map + coord_sf(crs = st_crs("ESRI:54009")) ``` <div class="figure" style="text-align: center"> <img src="maps_files/figure-html/unnamed-chunk-24-1.png" alt="Mollweide" width="720" /> <p class="caption">Mollweide</p> </div> --- ### Equidistant - preserves distance between any two points ```r world_map + coord_sf(crs = st_crs("ESRI:54032")) ``` <div class="figure" style="text-align: center"> <img src="maps_files/figure-html/unnamed-chunk-25-1.png" alt="Azimuthal Equidistant" width="360" /> <p class="caption">Azimuthal Equidistant</p> </div> --- ### Compromises - tries to compromise distortions among the various aspects ```r world_map + coord_sf(crs = st_crs("ESRI:54030")) ``` <div class="figure" style="text-align: center"> <img src="maps_files/figure-html/unnamed-chunk-26-1.png" alt="Robinson" width="720" /> <p class="caption">Robinson</p> </div> --- ### More Projections See <https://en.wikipedia.org/wiki/List_of_map_projections> for an exhaustive list of different map projections. Many projections have parameters that allow fine-grained control, but this is beyond the scope here. <div class="figure" style="text-align: center"> <img src="images/projection-waterman.png" alt="The Waterman Butterfly Projection. A compromise projection." width="70%" /> <p class="caption">The Waterman Butterfly Projection. A compromise projection.</p> </div> <!-- --- --> <!-- ## References --> <!-- ```{r, results = "asis", echo=FALSE} --> <!-- PrintBibliography(bib) --> <!-- ``` -->