geom_bar() and our EPI dataset.
library(tidyverse)
epi <- readRDS("./data/epir.RDS")
Bar plots are great for visually comparing values of categorical variables. Say you want to use a bar plot to visually compare the population of several countries.
epi_sub <- epi %>%
  filter(country %in% c("China", "India", "United States of America", "Brazil"),
         year == 2016)
ggplot(epi_sub) +
  geom_bar(aes(x = country, y = POP), stat = "identity") +
  xlab("") +
  ylab("Population in 2016")
Not a particularly beautiful plot, but we can clearly see that India and China have massive populations compared to the other countries we’ve selected. There are a few things we can do to make this look much better. First, we can reorder the bars by population:
ggplot(epi_sub) +
  geom_bar(aes(x = reorder(country, desc(POP)), y = POP), stat = "identity") +
  xlab("") +
  ylab("Population in 2016")
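To see what reorder() is doing here, it can help to look at the factor it returns outside of a plot. The sketch below uses only base R and a toy data set; the country labels and POP values are made up for illustration and are not taken from the EPI data. ggplot2 draws the categories of a discrete axis in factor-level order, so changing the levels changes the bar order.

```r
# Toy values, made up for illustration only (not from the EPI dataset).
country <- c("A", "B", "C")
POP     <- c(300, 1200, 200)

# reorder() returns a factor whose levels are sorted by the second argument,
# smallest first.
asc <- reorder(country, POP)
levels(asc)   # "C" "A" "B"

# Negating the values reverses the ordering. For a numeric column this is
# what dplyr's desc() does, so the largest value comes first.
dsc <- reorder(country, -POP)
levels(dsc)   # "B" "A" "C"
```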
For ascending order, just drop the desc() function. Another nice trick is to flip the bar plot to make the axes a bit easier to read:
ggplot(epi_sub) +
  geom_bar(aes(x = reorder(country, POP), y = POP), stat = "identity") +
  xlab("") +
  ylab("Population in 2016") +
  coord_flip()
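As an aside, there is another way to get the same horizontal layout. The sketch below assumes ggplot2 3.3.0 or later, where bar geoms work out their orientation from the aesthetics, and it uses geom_col(), which is shorthand for geom_bar(stat = "identity"). The toy data frame is a made-up stand-in for epi_sub, not real EPI values.

```r
library(ggplot2)

# Toy stand-in for epi_sub; values are made up for illustration only.
toy <- data.frame(
  country = c("A", "B", "C"),
  POP     = c(300, 1200, 200)
)

# The approach used in this tutorial: vertical bars, then flip the coordinates.
p_flip <- ggplot(toy) +
  geom_bar(aes(x = reorder(country, POP), y = POP), stat = "identity") +
  coord_flip()

# Alternative sketch: map the category to y and the value to x directly.
# geom_col() uses the values as given, so no stat argument is needed.
p_horiz <- ggplot(toy) +
  geom_col(aes(x = POP, y = reorder(country, POP)))

# Both plots encode the same bar lengths.
flip_lengths  <- sort(layer_data(p_flip)$y)
horiz_lengths <- sort(layer_data(p_horiz)$x)
```

Either version works. Note that with the second form the population axis is the x aesthetic, so labels and scales would be set with xlab() and scale_x_continuous() rather than their y counterparts.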
We can change the fill color, border color, and transparency of all bars as follows:
ggplot(epi_sub) +
  geom_bar(aes(x = reorder(country, POP), y = POP), stat = "identity",
           fill = "dodgerblue", color = "grey40", alpha = 0.5) +
  xlab("") +
  ylab("Population in 2016") +
  coord_flip()
Here, the fill argument refers to the color of the bars, color refers to the line around the bars, and alpha refers to the transparency of the fill.
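Because these three arguments sit outside aes(), they are constants applied to every bar rather than values mapped from the data. One way to check this, sketched below on a made-up toy data frame (not the EPI data), is to inspect the built layer with layer_data(). Note that ggplot2 stores the border under the British spelling colour.

```r
library(ggplot2)

# Toy stand-in for epi_sub; values are made up for illustration only.
toy <- data.frame(
  country = c("A", "B", "C"),
  POP     = c(300, 1200, 200)
)

p <- ggplot(toy) +
  geom_bar(aes(x = country, y = POP), stat = "identity",
           fill = "dodgerblue", color = "grey40", alpha = 0.5)

# layer_data() returns one row per bar with the aesthetics ggplot2 will draw.
built <- layer_data(p)
built[, c("x", "y", "fill", "colour", "alpha")]
```

Every row should carry the same fill, colour, and alpha. If fill were placed inside aes() instead, it would be treated as a variable to map to a color scale rather than as a literal color.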
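We can use the scales package to remove the scientific notation from the population axis and also improve some of our labeling and our theme. Because of coord_flip(), population is drawn on the horizontal axis but is still the y aesthetic, so its labels are set through scale_y_continuous():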
library(scales)
ggplot(epi_sub) +
  geom_bar(aes(x = reorder(country, POP), y = POP), stat = "identity",
           fill = "dodgerblue", color = "grey40", alpha = 0.5) +
  xlab("") +
  ylab("") +
  labs(title = "2016 Population") +
  coord_flip() +
  scale_y_continuous(labels = comma) +
  theme_minimal()
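The labels argument accepts a function that turns axis break values into text, and comma is one such function. As a small sketch, it can be called directly on a few numbers to preview the labels; the values below are arbitrary and not taken from the EPI data. label_comma() is the newer spelling and assumes scales 1.1.0 or later.

```r
library(scales)

# Arbitrary large numbers, chosen only to show the formatting.
big <- c(1500000, 1234567890)

# comma() inserts thousands separators and returns character labels.
comma(big)           # "1,500,000" "1,234,567,890"

# label_comma() builds an equivalent labelling function.
fmt <- label_comma()
fmt(big)
```

In a plot, either `labels = comma` or `labels = label_comma()` can be passed to scale_y_continuous().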