1 Carbon Dioxide Uptake in Grass Plants

In this example, we’ll take a look at an experiment on cold tolerance of the grass species Echinochloa crus-galli.

A specimen of Echinochloa crus-galli.

Figure 1: A specimen of Echinochloa crus-galli.

1.1 Data

The data we are going to look at is available in the CO2 data set from data sets (a package that is part of the base R distribution). There are five variables:

  • Plant: an ordered factor giving a unique identifier for each plant

  • Type: a factor with levels giving the origin of the plant (Quebec or Mississippi)

  • Treatment: a factor indicating if the sample was chilled or not

  • conc: a numeric vector of ambient carbon dioxide concentrations (mL/L)

  • uptake: a numeric vector of carbon dioxide uptake rates (umol/m^2 sec)

The goal of the experiment was to investigate how carbon dioxide uptake is affected based on the ambient carbon dioxide concentration (conc), the origin of the plant Type, and Treatment (whether the plant was chilled overnight before the experiment started or not).

1.2 Simple Scatter Plot

Let’s start with the basics and just look at a scatter plot of the uptake and ambient concentration. We’ll map uptake to the y axis, and concentration to the x axis. Before starting, we’ll rename some variables for clearer default axis labels and convert Plant from an ordered to an unordered variable, as it’s not ordinal in nature. We’ll call the new data set co2.

library(tidyverse)

co2 <-
  CO2 %>%
  as_tibble() %>%
  rename(Concentration = conc, Origin = Type, Uptake = uptake) %>%
  mutate(Plant = factor(Plant, ordered = FALSE))

ggplot(co2, aes(Concentration, Uptake)) +
  geom_point()
CO2 uptake and ambient CO2 concentration.

Figure 2: CO2 uptake and ambient CO2 concentration.

1.3 Adding Lines

However, using only the point geom here is not a good design choice, since our data is actually matched, meaning that the observations belong to specific plants, and there are multiple measurements for each plant. There are several ways to show this. For instance, we could use color, facets, or simply a line geom with mappings to the group aesthetic to address this issue. In this case, we will try the latter option, but feel free to experiment with the other ideas as well.

ggplot(co2, aes(Concentration, Uptake, group = Plant)) +
  geom_point() +
  geom_line()
Ambient CO2 concentration and uptake for 12 plants in different conditions and from different origins.

Figure 3: Ambient CO2 concentration and uptake for 12 plants in different conditions and from different origins.

Higher concentrations of CO2 evidently result in increased uptake of the same, which is not surprising. What kind of relationship would you describe this as? Please, take a moment to think about this.

1.4 Does Overnight Cooling Matter?

Until now, we haven’t included the other aspects of the experiment in our visualization. Let’s do that now. We will start by incorporating the cooling condition by mapping it to color. Also, it’s worth nothing that we have a natural color association here, with blue representing the cool condition, which we will certainly use.

ggplot(co2, aes(Concentration, Uptake, group = Plant, color = Treatment)) +
  geom_point() +
  geom_line() +
  scale_color_manual(values = c("dark orange", "steel blue"))
Ambient CO2 concentration and uptake for 12 plants under two different treatments (chilled overnight before the experiment or not).

Figure 4: Ambient CO2 concentration and uptake for 12 plants under two different treatments (chilled overnight before the experiment or not).

1.5 Plant Origin

It appears that the uptake is generally improved in the non-chilled arm of the experiment, but there are two clear clusters here. Let’s explore whether the origin of the plants might help explain this bi-modal distribution.

ggplot(co2, aes(Concentration, Uptake, group = Plant, color = Treatment)) +
  geom_point() +
  geom_line() +
  facet_wrap("Origin") + # or vars(Origin) or ~ Origin
  scale_color_manual(values = c("dark orange", "steel blue"))
Ambient CO2 concentration and uptake for 12 plants under two different treatments (chilled overnight before the experiment or not) and from two different origins.

Figure 5: Ambient CO2 concentration and uptake for 12 plants under two different treatments (chilled overnight before the experiment or not) and from two different origins.

Yes, indeed, it appears that the origin of the plants explains this precisely. As we know, Mississippi is a much warmer state than Quebec, so it makes sense that we observe this adaptation in the plants in Quebec.

This example demonstrates the importance of exploring multivariate relationships by effectively conditioning your visualizations on certain variables.

2 Source Code

The source code for this document is available at https://github.com/stat-lu/dataviz/blob/main/worked-examples/plants.Rmd.