class: middle, center, title-slide

.title[
# Scales
]
.subtitle[
## Data Visualization
]
.author[
### Johan Larsson
]
.author[
### Behnaz Pirzamanbein
]
.institute[
### The Department of Statistics, Lund University
]

---

## Scales and Ranges

Choosing appropriate scales can make or break a visualization!

One must be **knowledgeable** about what the data represent.

Sometimes the choice is obvious; at other times it is **tricky** to pick appropriate scales.

.pull-left[
<img src="scales_files/figure-html/unnamed-chunk-1-1.png" width="345.6" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="scales_files/figure-html/unnamed-chunk-2-1.png" width="345.6" style="display: block; margin: auto;" />
]

.center[*Number of reported crimes in Sweden (https://www.bra.se).*]

---

## Scales and Limits

It is not always appropriate to include the entire possible range of a variable, such as when

- 0 is not a meaningful value for the variable,
- only a small subset of the range is attained in practice, or
- even small differences matter a lot.

--

<div class="figure" style="text-align: center">
<img src="scales_files/figure-html/unnamed-chunk-3-1.png" alt="Change in reported crimes between 2019 and 2020 in Sweden (https://bra.se)." width="345.6" /><img src="scales_files/figure-html/unnamed-chunk-3-2.png" alt="Change in reported crimes between 2019 and 2020 in Sweden (https://bra.se)." width="345.6" />
<p class="caption">Change in reported crimes between 2019 and 2020 in Sweden (https://bra.se).</p>
</div>

---

## Scales and Limits when using Color

Choosing limits for color-mapped variables can also be problematic!

<div class="figure" style="text-align: center">
<img src="scales_files/figure-html/unnamed-chunk-4-1.png" alt="Life expectancy in the US." width="720" />
<p class="caption">Life expectancy in the US.</p>
</div>

---

## Scales and Limits when using Color

Choosing limits for color-mapped variables can also be problematic!
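
As a minimal sketch of how such limits can be set in ggplot2, the code below uses the built-in `mtcars` data (not the life-expectancy data behind the figures on these slides). The limit values `c(0, 50)` and `c(15, 25)` are arbitrary choices made for illustration only.

```r
library(ggplot2)

# Base plot: fuel economy (mpg) mapped to color.
p <- ggplot(mtcars, aes(wt, hp, color = mpg)) +
  geom_point(size = 3)

# Default: limits are taken from the data, so the whole palette
# is spent on the observed range of mpg.
p_data <- p + scale_color_viridis_c()

# Wide, hand-picked limits: the observed values occupy only part
# of the palette, so differences between cars are harder to see.
p_wide <- p + scale_color_viridis_c(limits = c(0, 50))

# Narrow limits: by default, values outside the limits become NA and
# are drawn in grey. oob = scales::squish clamps them to the nearest
# limit instead.
p_narrow <- p +
  scale_color_viridis_c(limits = c(15, 25), oob = scales::squish)
```

Printing `p_data`, `p_wide`, and `p_narrow` in turn shows how the same data can look quite different depending on the limits.
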
<div class="figure" style="text-align: center">
<img src="scales_files/figure-html/unnamed-chunk-5-1.png" alt="Life expectancy in the US with other limits [66, 80]." width="720" />
<p class="caption">Life expectancy in the US with other limits [66, 80].</p>
</div>

---

## Size

When mapping a variable to size, map it to **area** (not height or width).

The result is usually a bubble chart.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, size = hp)) +
  geom_point()
```

`scale_size_area()` scales the point area in proportion to the value, whereas `scale_radius()` scales the radius.

.pull-left[

```r
*p + scale_size_area()
```

<img src="scales_files/figure-html/unnamed-chunk-7-1.png" width="345.6" style="display: block; margin: auto;" />
]

.pull-right[

```r
*p + scale_radius()
```

<img src="scales_files/figure-html/unnamed-chunk-8-1.png" width="345.6" style="display: block; margin: auto;" />
]

---

## The Lie Factor

.pull-left[
The lie factor is another concept introduced by Edward Tufte:

$$
\text{lie factor} = \frac{\text{size of effect in visualization}}{\text{size of effect in data}}
$$

- We should strive for a lie factor of 1.
]

.pull-right[
<div class="figure" style="text-align: center">
<img src="images/liefactor_doctor.jpg" alt="Data mapped to height and width of the doctor image, inducing a lie factor of 2.8." width="90%" />
<p class="caption">Data mapped to height and width of the doctor image, inducing a lie factor of 2.8.</p>
</div>
]
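
---

## The Lie Factor

The calculation can be sketched in a few lines of R. This assumes that the size of an effect is measured as relative change, (second - first) / first. The helpers `effect_size()` and `lie_factor()` and all numbers are hypothetical, made up for illustration; they are not the data behind the doctor image.

```r
# Relative change from a first to a second value.
effect_size <- function(first, second) {
  (second - first) / first
}

# Hypothetical helper: ratio of the effect shown in the graphic
# to the effect present in the data.
lie_factor <- function(shown_first, shown_second, data_first, data_second) {
  effect_size(shown_first, shown_second) / effect_size(data_first, data_second)
}

# Made-up data: a value that doubles from 10 to 20.
data_first <- 10
data_second <- 20

# Mapping the value to both height and width of an icon makes the
# drawn area grow with the square of the value.
lf_height_and_width <- lie_factor(
  data_first^2, data_second^2,
  data_first, data_second
)

# Mapping the value to area keeps the drawn effect equal to the
# effect in the data.
lf_area <- lie_factor(data_first, data_second, data_first, data_second)

print(lf_height_and_width) # 3
print(lf_area)             # 1
```

In this made-up case, scaling both height and width triples the apparent effect, while mapping to area gives a lie factor of 1.
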