class: middle, center, title-slide

.title[
# Color
]
.subtitle[
## Data Visualization
]
.author[
### Johan Larsson
]
.author[
### Behnaz Pirzamanbein
]
.institute[
### The Department of Statistics, Lund University
]

---

## Color in Visualizations

How do we choose colors appropriately? We need to pay attention to the data type!

--

.pull-left[

### Sequential Palette

<img src="color_files/figure-html/unnamed-chunk-1-1.png" width="360" style="display: block; margin: auto;" />

- implies an order among values

### Qualitative Palette

<img src="color_files/figure-html/unnamed-chunk-2-1.png" width="360" style="display: block; margin: auto;" />

- implies no ordering

]

.pull-right[

### Diverging Palette

<img src="color_files/figure-html/unnamed-chunk-3-1.png" width="360" style="display: block; margin: auto;" />

- emphasizes low, mid, and high values

]

---

## Natural Color Mappings

Sometimes, the type of data implies a natural color mapping, such as temperature, geographical features, or political affiliation.

<div class="figure" style="text-align: center">
<img src="images/europe-forest.png" alt="Forest cover in Europe (www.eea.europa.eu)." width="70%" />
<p class="caption">Forest cover in Europe (www.eea.europa.eu).</p>
</div>

---

## ggplot2

ggplot2 tries to **guess** which color palette to use:

- ordinal and numerical variables are mapped to a **sequential** palette
- unordered categorical variables are mapped to a **qualitative** palette

but ggplot2 never knows when to map a variable to a **diverging** palette, or which direction of values indicates the strongest intensity.

--

### Applying a Color Palette in ggplot2

- `scale_colour_*` for color mappings:
  - `geom_point()`,
  - `geom_line()`,
  - etc.
- `scale_fill_*` for fill mappings:
  - `geom_col()`,
  - `geom_tile()`,
  - etc.

---

## Sequential Palettes

A sequential palette

- is typically used for data that is ordered from low to high.
- consists of colors with varying lightness, progressing from light to dark.

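The light-to-dark progression can also be set by hand for a continuous variable. The following is a minimal sketch, assuming the `faithfuld` data bundled with ggplot2 and `scale_fill_gradient()`; the object name `p_seq` and the two endpoint colors are arbitrary choices for illustration, not part of the original slides.

```r
library(ggplot2)

# faithfuld ships with ggplot2: a 2D density estimate of the Old Faithful data
p_seq <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_raster() +
  # light-to-dark ramp: low densities are drawn light, high densities dark
  scale_fill_gradient(low = "grey95", high = "darkblue")

p_seq
```

Swapping `low` and `high` reverses the direction of the ramp, which is one way to control which end of the scale reads as most intense.
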
**Examples:** population count, quality rating, size

```r
ggplot(diamonds, aes(price, fill = cut)) +
  geom_histogram(binwidth = 1000) +
* scale_fill_ordinal()
```

<img src="color_files/figure-html/unnamed-chunk-5-1.png" width="648" style="display: block; margin: auto;" />

---

## Diverging Palettes

Diverging palettes place equal emphasis on the middle, low, and high ends of the scale.

In ggplot2:

- `scale_fill_gradient2()`
- `scale_color_gradient2()`

**Examples:** temperature (Celsius), budget balance, correlations

.pull-left[

```r
# see source for dataset
mtcars_cor %>%
  ggplot(aes(var1, var2, fill = cor)) +
  geom_tile() +
* scale_fill_gradient2() +
  labs(x = NULL, y = NULL)
```

]

.pull-right[

<div class="figure" style="text-align: center">
<img src="color_files/figure-html/unnamed-chunk-7-1.png" alt="Heatmap of correlation matrix." width="324" />
<p class="caption">Heatmap of correlation matrix.</p>
</div>

]

---

## Qualitative Palettes

Colors should be as distinct as possible and should not signal differences in magnitude.

**Examples:** political party affiliation, gender

```r
ggplot(msleep, aes(bodywt, sleep_total, color = vore)) +
  geom_point() +
  scale_x_log10() +
* scale_color_discrete()
```

<img src="color_files/figure-html/unnamed-chunk-8-1.png" width="576" style="display: block; margin: auto;" />

---

## Color Blindness

.pull-left[

Red-green color blindness affects 8% of men and 0.5% of women.<sup>1</sup>

Several different types:

- protanopia
- deuteranopia
- protanomaly
- deuteranomaly
- tritanopia
- tritanomaly

]

.pull-right[

<div class="figure" style="text-align: center">
<img src="images/ishihara.png" alt="Ishihara color test. What number do you see? (en.wikipedia.org/wiki/Color_blindness)" width="455" />
<p class="caption">Ishihara color test. What number do you see? (en.wikipedia.org/wiki/Color_blindness)</p>
</div>

]

.footnote[
<sup>1</sup> Of people of Northern European descent.
]

---

### Simulating Color Blindness

```r
library(colorBlindness)

p <- ggplot(mpg, aes(hwy, fill = drv)) +
  geom_density(alpha = 0.5)

*cvdPlot(p)
```

<img src="color_files/figure-html/unnamed-chunk-10-1.png" width="720" style="display: block; margin: auto;" />

---

## ColorBrewer

ColorBrewer, based on work on color use in maps <a name=cite-harrower2003></a>([Harrower and Brewer, 2003](https://www.tandfonline.com/doi/abs/10.1179/000870403235002042)), is applicable to most types of plots.

- sequential, diverging, and qualitative palettes, many tailored to individuals with color blindness
- requires the [RColorBrewer](https://CRAN.R-project.org/package=RColorBrewer) package

.pull-left[

- `scale_color_brewer()`: discrete (integer, categorical) data
- `scale_color_distiller()`: continuous data

```r
p +
  scale_fill_brewer(
    palette = "Accent"
  )
```

]

.pull-right[

<img src="color_files/figure-html/unnamed-chunk-11-1.png" width="345.6" style="display: block; margin: auto;" />

]

---

<div class="figure" style="text-align: center">
<img src="color_files/figure-html/unnamed-chunk-12-1.png" alt="All the palettes in ColorBrewer." width="720" />
<p class="caption">All the palettes in ColorBrewer.</p>
</div>

---

## Viridis

**Sequential** palettes created by Stéfan van der Walt and Nathaniel Smith for the Python matplotlib library

- are perceptually uniform and great for individuals with color blindness.
- require [viridis](https://CRAN.R-project.org/package=viridis) (or [viridisLite](https://CRAN.R-project.org/package=viridisLite))

.pull-left[

- `scale_*_viridis_d()`: discrete data
- `scale_*_viridis_c()`: continuous data

```r
tibble(x = rnorm(1e4), y = rnorm(1e4)) %>%
  ggplot(aes(x, y)) +
  geom_hex() +
* scale_fill_viridis_c() +
  coord_fixed()
```

]

.pull-right[

<img src="color_files/figure-html/unnamed-chunk-13-1.png" width="345.6" style="display: block; margin: auto;" />

]

---

<div class="figure" style="text-align: center">
<img src="color_files/figure-html/unnamed-chunk-14-1.png" alt="Palettes from the viridis package." width="576" /><img src="color_files/figure-html/unnamed-chunk-14-2.png" alt="Palettes from the viridis package." width="576" /><img src="color_files/figure-html/unnamed-chunk-14-3.png" alt="Palettes from the viridis package." width="576" /><img src="color_files/figure-html/unnamed-chunk-14-4.png" alt="Palettes from the viridis package." width="576" /><img src="color_files/figure-html/unnamed-chunk-14-5.png" alt="Palettes from the viridis package." width="576" />
<p class="caption">Palettes from the viridis package.</p>
</div>

---

## Other Considerations

### 1. Greyscale

Color visualizations are still sometimes printed in greyscale, so it may be a good idea to use palettes that vary in lightness.

<div class="figure" style="text-align: center">
<img src="color_files/figure-html/unnamed-chunk-15-1.png" alt="The same plot in original colors and greyscale." width="720" />
<p class="caption">The same plot in original colors and greyscale.</p>
</div>

---

### 2. Aesthetics

Aesthetics are **subjective** and **fickle**, yet they remain important to consider. Beautifully crafted visualizations tend to attract more attention.

<div class="figure" style="text-align: center">
<img src="color_files/figure-html/unnamed-chunk-16-1.png" alt="Which would you prefer?"
width="720" />
<p class="caption">Which would you prefer?</p>
</div>

---

## References

<a name=bib-harrower2003></a>[Harrower, M. and C. A. Brewer](#cite-harrower2003) (2003). "ColorBrewer.Org: An Online Tool for Selecting Colour Schemes for Maps". In: _The Cartographic Journal_ 40.1, pp. 27-37. ISSN: 0008-7041. DOI: [10.1179/000870403235002042](https://doi.org/10.1179%2F000870403235002042). URL: [https://www.tandfonline.com/doi/abs/10.1179/000870403235002042](https://www.tandfonline.com/doi/abs/10.1179/000870403235002042) (visited on Sep. 16, 2020).