2024 Du Bois Challenge

The Data Visualisation Challenge

Simisani Nokulunga Ndaba

Talk Overview

  • Du Bois Challenge.
  • Who was W.E.B Du Bois.
  • 2024 Du Bois Challenge.
  • Recreated plots using R.
  • Other data viz challenges

Who am I

  • Simisani Nokulunga Ndaba

  • Teaching Assistant in the Department of Computer Science at the University of Botswana

  • Founder & co-organiser of R-Ladies Gaborone

  • An occasional blogger

  • Enjoy creating data visualisations

My Data viz

#DuBoisChallenge

Background

  • 1st challenge in February 2021.
  • Started by Allen Hillery and Sekou Tyler.
  • Possibly started as part of Black History Month (a guess).
  • Anthony Starks recreations
  • From 7 weeks to 10 weeks

Who was W.E.B Du Bois

W.E.B Du Bois

  • Born February 23, 1868, Massachusetts, U.S.; died August 27, 1963, Accra, Ghana

  • First Black person to earn a PhD from Harvard

  • Black American activist and sociologist.

  • Co-founder of the NAACP.

1900 Paris Exposition

Paris Exposition Universelle / Paris Expo

  • The fifth, and most adventurous, World's Fair held in Paris.

  • Showcased world-leading art, technology and industry

  • In the same year, Paris hosted the Olympic Games.

Warning: The charts use outdated titles, words and names to refer to Black people

Challenge 01 - week 1

Negro Population of Georgia By Counties _ plate 6

Challenge 01 - R Code

# Install packages (only needed once)
install.packages("tidyverse")
install.packages("ggplot2")
install.packages("dplyr")
install.packages("ggforce")
install.packages("maps")
install.packages("patchwork")
install.packages("cowplot")

# Load libraries
library(tidyverse)
library(ggplot2)
library(dplyr)       # loaded explicitly in case tidyverse does not attach it
library(ggforce)     # geom_circle()
library(maps)
library(cowplot)

                              #getting the 1870 data
data1870 = read.csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge01/data1870.csv")

                              #getting the 1880 data
data1880 = read.csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge01/data1880.csv")

                               #data wrangling 1870 data
names(data1870) <- tolower(names(data1870))
colnames(data1870) <- c('subregion','population') 
data1870$subregion<-tolower(data1870$subregion)

                              #data wrangling 1880 data
names(data1880) <- tolower(names(data1880))
colnames(data1880) <- c('subregion','population') 
data1880$subregion<-tolower(data1880$subregion)

                             #using map_data() to get state and county outlines
world <- map_data("world")
usa_states <- map_data("state")
georgia <- subset(usa_states, region %in% c("georgia"))

usa_counties <- map_data("county")
georgia_counties <- subset(usa_counties, region == "georgia")

                                      #merging columns
data1870_georgia_counties <- left_join(georgia_counties, data1870,by="subregion")
data1880_georgia_counties <- left_join(georgia_counties, data1880,by="subregion")

                                     #ordering columns
data1870_georgia_counties <- data1870_georgia_counties[,c(2,3,4,5,6,1,7)]
data1880_georgia_counties <- data1880_georgia_counties[,c(2,3,4,5,6,1,7)]

                              #changing population values in 1870
                              #(note: unquoted right-hand sides such as 1000 - 2500 are evaluated as subtraction, giving -1500)
data1870_georgia_counties$population[data1870_georgia_counties$population == '> 1000'] <- 1000
data1870_georgia_counties$population[data1870_georgia_counties$population == '> 1000 - 2500'] <- 1000 - 2500
data1870_georgia_counties$population[data1870_georgia_counties$population == '> 2500 - 5000'] <- 2500 - 5000
data1870_georgia_counties$population[data1870_georgia_counties$population == '> 5000 - 10000'] <- 5000 - 10000
data1870_georgia_counties$population[data1870_georgia_counties$population == '> 10000 - 15000'] <- 10000 - 15000
data1870_georgia_counties$population[data1870_georgia_counties$population == '> 15000 - 20000'] <- 15000 - 20000
data1870_georgia_counties$population[data1870_georgia_counties$population == '> 20000 - 30000'] <- 20000 - 30000

                             #changing population values in 1880
                             #(note: unquoted right-hand sides such as 1000 - 2500 are evaluated as subtraction, giving -1500)
data1880_georgia_counties$population[data1880_georgia_counties$population == '> 1000'] <- 1000
data1880_georgia_counties$population[data1880_georgia_counties$population == '> 1000 - 2500'] <- 1000 - 2500
data1880_georgia_counties$population[data1880_georgia_counties$population == '> 2500 - 5000'] <- 2500 - 5000
data1880_georgia_counties$population[data1880_georgia_counties$population == '> 5000 - 10000'] <- 5000 - 10000
data1880_georgia_counties$population[data1880_georgia_counties$population == '> 10000 - 15000'] <- 10000 - 15000
data1880_georgia_counties$population[data1880_georgia_counties$population == '> 15000 - 20000'] <- 15000 - 20000
data1880_georgia_counties$population[data1880_georgia_counties$population == '> 20000 - 30000'] <- 20000 - 30000

                                        #plotting 1870
a <- ggplot(data = data1870_georgia_counties, 
       aes(x = long, y = lat, group = group, fill = population)) + 
  geom_polygon(data = data1870_georgia_counties, 
               mapping = aes(long, lat, group = group),  color = "black") +
  scale_fill_manual(values = c("white","darkgreen", "yellow", "#CC9966", "chocolate4", "midnightblue","lightsalmon", "#FF0000", "white")) +
  coord_fixed(1.4) + 
  theme_void() +
  labs(#title="NEGRO POPULATION OF GEORGIA BY COUNTIES",
       fill = "population") +
  theme(legend.position="none") +
  annotate(geom = "text", 
    x = -84.6, y = 35.3, 
    size = 4, color = "black", lineheight = .9,
    label = "1870" , fontface = "bold")
a


                                    #Top right Legend:
df <- data.frame(x=c(1, 2, 2, 3, 3, 4, 8, 10),
                 y=c(2, 4, 5, 4, 7, 9, 10, 12)) 
b <- ggplot(data = df, aes(x, y)) +
  theme_void() +
  geom_circle(aes(x0=1, y0=8, r=0.2),                  #ggforce
              fill='midnightblue', inherit.aes=FALSE) +
  geom_circle(aes(x0=1, y0=6, r=0.2),
              fill='chocolate4',inherit.aes=FALSE) +
  geom_circle(aes(x0=1, y0=4, r=0.2),
              fill='tan', inherit.aes=FALSE) +
  scale_x_continuous(expand = c(0, 0), 
                     #breaks = seq(0, 10, 1000), 
                     limits = c(0, 10)) +
  coord_fixed() +
  annotate(geom = "text", 
         x = 3.5 , y = 4, 
         size = 4 , color = "black", lineheight = .6,
         label = "10,000 TO 15,000") + 
  annotate(geom = "text", 
           x = 3.5, y = 6, 
           size = 4, color = "black", lineheight = .6,
           label = "15,000 TO 20,000") +
  annotate(geom = "text", 
           x = 4, y = 8, 
           size = 4, color = "black", lineheight = .6,
           label = "BETWEEN 20,000 AND 30,000")
b


                                #Bottom left Legend:
dff <- data.frame(x=c(1, 2, 2, 3, 3, 4, 8, 10),
                  y=c(2, 4, 5, 4, 7, 9, 10, 12))
z <- ggplot(data = dff, aes(x, y)) +
  theme_void()+
  geom_circle(aes(x0=1, y0=11, r=0.5),
              fill='#FF0000', inherit.aes=FALSE) +
  geom_circle(aes(x0=1, y0=8, r=0.5),
              fill='lightsalmon', inherit.aes=FALSE) +
  geom_circle(aes(x0=1, y0=5, r=0.5),
              fill='yellow',inherit.aes=FALSE) +
  geom_circle(aes(x0=1, y0=2, r=0.5),
              fill='darkgreen',inherit.aes=FALSE) +
  scale_x_continuous(expand = c(0, 0), 
                     #breaks = seq(0, 10, 1000), 
                   limits = c(0, 10)) +
 coord_fixed() +
  annotate(geom = "text", 
           x = 6, y = 2, 
           size = 4, color = "black", lineheight = .6,
           label = "UNDER 1,000") + 
  annotate(geom = "text", 
           x = 6, y = 5, 
           size = 4, color = "black", lineheight = .6,
           label = "1,000 TO 2,500") +
  annotate(geom = "text", 
           x = 6, y = 8, 
           size = 4, color = "black", lineheight = .6,
           label = "2,500 TO 5,000") +
  annotate(geom = "text", 
           x = 6, y = 11, 
           size = 4, color = "black", lineheight = .6,
           label = "5,000 TO 10,000")
z


                                      #plotting 1880
d <- ggplot(data = data1880_georgia_counties, 
            aes(x = long, y = lat, group = group, fill = population)) + 
  geom_polygon(data = data1880_georgia_counties, 
               mapping = aes(long, lat, group = group),  color = "black") +
  scale_fill_manual(values = c("darkgreen","yellow", "#CC9966", "chocolate4","#330033",  "lightsalmon" , "#FF0000", "white")) +
  coord_fixed(1.4) + 
  theme_void() +
  labs(fill = "population") +
  theme(legend.position="none") +
  annotate(geom = "text", 
           x = -84.6, y = 35.5, 
           size = 4, color = "black", lineheight = .9,
           label = "1880" , fontface = "bold")
d

                              #cowplot to combine plots 
title <- ggdraw() + 
  draw_label(
    "NEGRO POPULATION OF GEORGIA BY COUNTIES",
    fontface = 'bold',
    x = 0,
    hjust = 0  ) 
e <- plot_grid(a,b,z,d, ncol = 2, align = 'tb',
          # rel_heights = c(5,3),
          rel_widths = c(5,4))
e

plot <- plot_grid(title,e,ncol = 1,
                  # rel_heights values control vertical title margins
                  rel_heights = c(0.1, 1))+
  theme(plot.margin = margin(0, 0, 0, 7),
        text = element_text('mono'),
        panel.background = element_blank(),
        plot.title = element_text(hjust = 0.5, size=23, face = "bold"),
        plot.background = element_rect(fill = 'papayawhip'))
plot

p2 <- ggdraw(add_sub(plot, "\n\nsource:Du Bois 1900 Exposition Paris| Graphic by: Simisani Ndaba"))
p2

ggsave("challenge01.png", plot = p2, width = 25, height = 15, units = "cm")
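
A possible way to handle the binned population labels, sketched under assumptions: in R an unquoted right-hand side such as 1000 - 2500 is subtraction, so several bins can end up with the same value. Storing the bins as a factor with explicit levels keeps them distinct and fixes the order in which scale_fill_manual() hands out its colours. The label spellings and the `raw` vector below are illustrative, not taken from the CSV files.

```r
# Unquoted, the right-hand side is plain arithmetic
1000 - 2500                        # -1500
(5000 - 10000) == (10000 - 15000)  # TRUE: two different bins both give -5000

# Illustrative labels; the spellings in the real CSV files may differ
bins <- c("> 1000", "1000 - 2500", "2500 - 5000", "5000 - 10000",
          "10000 - 15000", "15000 - 20000", "20000 - 30000")
raw <- c("2500 - 5000", "> 1000", "20000 - 30000", "2500 - 5000")

# A factor with explicit levels keeps the bins distinct and ordered
population <- factor(raw, levels = bins)
table(population)
```

With a factor like this, unnamed colours passed to scale_fill_manual() are matched to the bins in level order, so the palette no longer depends on how the labels happen to sort.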

Challenge 01 - R code output

Challenge 01 - Original and Recreation

Challenge 02 - week 2

Slaves and Free Negroes _ plate 12

Challenge 02 - R code

install.packages("tidyverse")
install.packages("ggplot2")

library(tidyverse)
library(ggplot2)

challenge02 <- read_csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge02/challenge02data.csv")

                                               #plot
ggplot(data = challenge02, aes(x = Year, y = Free)) +
  geom_line() +
  geom_ribbon(data = NULL, aes(ymin = Free, ymax = Inf), fill = "black", alpha = 0.8) +
  geom_ribbon(data = NULL, aes(ymin = -Inf, ymax = Free), fill = "red", alpha = 0.9) +
  theme_minimal() +
  labs(
    title = "SLAVES AND FREE NEGROES.",
    caption = "Source: Du Bois Plate 12 | Graphic: Simisani Ndaba") +
  theme(
    plot.title = element_text(hjust = 0.5, family = "mono", 
                              size = 23, face = "bold"),
    plot.background = element_rect(fill = 'papayawhip'),
    plot.caption = element_text(hjust = 0.5, family = "mono", 
                                size = 15, face = "bold"),
    axis.title.x=element_blank(),
    axis.title.y = element_blank(),
    axis.text.x=element_blank(),
    axis.ticks.x=element_blank()) +
  scale_x_reverse(
    breaks = seq(0, 1870, 10),
    limits =  c(1870, 1789)) +               # c(1870, 1790)) +
  scale_y_reverse(
    breaks = seq(0, 1.8, 0.5),
    limits = c(1.8, 0.5)) +
  annotate(geom = "text", label="3%",size=4,x=1789,y=1.8)+
  annotate(geom = "text", label="2%",size=4,x=1789,y=1.5)+
  annotate(geom = "text", label="1%",size=4,x=1789,y=1)+
  annotate(geom = "text", label="Percent of\nFree Negroes\n1.3%",size=3,x=1789,y=0.5)+
  annotate(geom = "text", label="1.7%",size=3,x=1800,y=0.5)+
  annotate(geom = "text", label="1.7%",size=3,x=1810,y=0.5)+
  annotate(geom = "text", label="1.2%",size=3,x=1820,y=0.5)+
  annotate(geom = "text", label="0.8%",size=3,x=1830,y=0.5)+
  annotate(geom = "text", label="0.9%",size=3,x=1840,y=0.5)+
  annotate(geom = "text", label="0.7%",size=3,x=1850,y=0.5)+
  annotate(geom = "text", label="0.8%",size=3,x=1860,y=0.5)+
  annotate(geom = "text", label="100%",size=3,x=1870,y=0.5)+
  coord_flip()


                                #saving the plot
ggsave("challenge02.png", width=8, height=15)

Challenge 02 - R code output

Challenge 02 - Original and Recreation

Challenge 05 - week 5

Race Amalgamation in Georgia _ plate 13

Challenge 05 - R code

library(tidyverse)
library(ggplot2)

challenge05 <- read_csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge05/challenge05data.csv")

                         #Create the stacked bar plot
plot <- ggplot(challenge05, aes(x = "", y = Percentage, fill = Category)) +
  geom_bar(stat = "identity", position = "stack", width = 0.4) +
  labs(x = "", y = "Percentage", 
       title = "RACE AMALGAMATION IN GEORGIA.",
       subtitle = "BASED ON A STUDY OF 40,000 INDIVIDUALS OF NEGRO DESCENT.",
       caption = "Du Bois in Rstats by simisani.ndaba") +
  scale_fill_manual(values = c("black", "chocolate4", "yellow")) +
  theme_void() +
  theme(text = element_text('mono'),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5, size=18, face = "bold"),
        plot.subtitle = element_text(hjust = 0.5, size=10),
        plot.background = element_rect(fill = 'papayawhip'))
 plot +  
   annotate("text", x = 0.55, y = 90, label = "BLACK.", size = 5, hjust = 0, fontface = "bold") +
   annotate("text", x = 0.60, y = 87, label = "IE.FULL-BLOOD.\nNEGROES.", size = 2, hjust = 0) +
   annotate("text", x = 0.95, y = 80, label = "44%", size = 5, hjust = 0, colour = "white")+
   
   annotate("text", x = 0.55, y = 50, label = "BROWN.", size = 5, hjust = 0, fontface = "bold") +
   annotate("text", x = 0.60, y = 44, label = "IE.PERSONS WITH\nSOME WHITE BLOOD\nOR DESCENDENTS\nOF LIGHT COLORED\nAFRICANS", size = 2,hjust = 0) +
   annotate("text", x = 0.95, y = 35, label = "40%", size = 5, hjust = 0, colour = "red")+
   
   annotate("text", x = 0.55, y = 16, label = "YELLOW.", size = 5, hjust = 0, fontface = "bold") +
   annotate("text", x = 0.60, y = 12, label = "IE. PERSONS WITH\nMORE WHITE THAN\nNEGRO BLOOD.", size =2,hjust = 0) +
   annotate("text", x = 0.95, y = 8, label = "16%", size = 5, hjust = 0, colour = "black")

 ggsave("challenge05.png", width = 8, height = 8)
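
The label heights in the annotate() calls above are typed in by hand. As a rough sketch, the midpoint of each stacked segment can also be computed from the percentages shown on the chart (16, 40 and 44). The bottom-to-top order used here is an assumption about how the bars stack, and `segments` is a hypothetical name.

```r
# Percentages from the chart, listed bottom to top (assumed stacking order)
segments <- data.frame(
  Category   = c("Yellow", "Brown", "Black"),
  Percentage = c(16, 40, 44)
)

# Top edge of each segment, then step back half a segment to reach its middle
top <- cumsum(segments$Percentage)
segments$label_y <- top - segments$Percentage / 2
segments
```

The computed midpoints are 8, 36 and 78, close to the hand-picked heights of 8, 35 and 80.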

Challenge 05 - R code output

Challenge 05 - Original and Recreation

Challenge 07 - week 7

Illiteracy of the American Negroes compared with that of other nations _ plate 47

Challenge 07 - R code

install.packages("tidyverse")
install.packages("ggplot2")

library(tidyverse)
library(ggplot2)

#extract and read data
challenge07 <- read_csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge07/challenge07data.csv")

challenge07 <- rbind(c("Romaine",72.9363), challenge07)

colnames(challenge07)[1] <- "Country" 
colnames(challenge07)[2] <- "Proposion" 

literacy <- c("other", "other", "other","negroes","other","other","other","other","other","other")
challenge07['literacy'] <- literacy


view(challenge07)

challenge07$Proposion <- as.numeric(challenge07$Proposion) * 10

challenge07 <- arrange(challenge07, Proposion)
challenge07$Country <- factor(challenge07$Country, levels = challenge07$Country)
view(challenge07)


ggplot(challenge07, aes(Country, Proposion, fill = literacy)) + 
  theme_minimal() +
  labs(
    title = "Illiteracy of the American Negroes compared with that of other nations.\n",
    subtitle = "Proportion d' illettrés parmi les Nègres Americains comparée à celle des autres nations.\n\nDone by Atlanta University.\n\n\n\n",
    caption = "Source: Du Bois Paris Exposition- Plate 47\nCreated by Simi Ndaba") + 
  theme(text = element_text('mono'),
        legend.position = "none",
        plot.title = element_text(hjust = 0.5, size=16),
        plot.subtitle = element_text(hjust = 0.5, size=12),
        plot.caption = element_text(hjust = 0.5, size=12),
        plot.background = element_rect(fill = 'papayawhip'),
        axis.title.x=element_blank(),
        axis.text.x=element_blank(),
        axis.ticks.x=element_blank(),
        axis.title.y=element_blank(),
        axis.text.y =element_text(size = 13)) +
  geom_col(width = 0.6, position = position_dodge(width = 0.9)) +
  scale_fill_manual(values = c("red", "darkgreen")) +
  # geom_bar(stat="identity", fill="green") + 
  coord_flip() 

ggsave("challenge07.png", width = 12, height = 15)
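
Sorting the rows and then fixing the factor levels is what makes the bars appear in order of size. A minimal sketch of that step with invented values (the country names and numbers below are placeholders, not the challenge data):

```r
# Placeholder data, not the challenge values
scores <- data.frame(
  Country    = c("A", "B", "C"),
  Proportion = c(30, 10, 20)
)

# Sort by value, then lock that order into the factor levels
scores <- scores[order(scores$Proportion), ]
scores$Country <- factor(scores$Country, levels = scores$Country)

levels(scores$Country)   # "B" "C" "A"
```

ggplot2 draws a discrete axis in level order, so the bars follow the sorted values rather than the alphabet.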

Challenge 07 - R code output

Challenge 07 - Original and Recreation

Challenge 08 - week 8

The Rise of the Negroes From Slavery To Freedom In One Generation _ plate 50

Challenge 08 - R code

#install packages
install.packages("tidyr")
install.packages("readr")
install.packages("ggplot2")
install.packages("cowplot")
install.packages("patchwork")

#load and extract the libraries to use
library(tidyr)
library(readr)
library(ggplot2)
library(cowplot)
library(patchwork)

#read data using readr package 
challenge08 <- readr::read_csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge08/challenge08data.csv")

#first data frame
df <- data.frame(x=c(1, 4),
                  y=c(1,4)) 
c <- ggplot(data = df, aes(x, y)) +
  theme_void() +
 scale_y_continuous(limits = c(0,4 )) +
  scale_x_continuous(limits = c(0,4)) +
  annotate(geom = "text", x = 1.14, y = 3, size = 3, color = "black", lineheight = .6,
           label = "IN 1890 NEARLY ONE FIFTH OF THEM OWNED THEIR OWN HOMES AND FARMS.
           \n   THIS ADVANCE WAS ACCOMPLISHED ENTIRELY WITHOUT STATE AID, AND IN THE")+
  annotate(geom = "text", x = 0.6, y = 2.6,  size = 3, color = "black", lineheight = .6,
           label = "FACE OF PROSCRIPTIVE LAWS.")+
  
  annotate(geom = "text", x = 1.1, y = 2.2, size = 3, color = "black", lineheight = .6,
           label = "EN 1890 ENVIRON UN CINQUIÈME ÉTAIENT PROPRIÉTAIRES DE LEURS HABI-")+
  annotate(geom = "text",  x = 1.05, y = 1.8,  size = 3, color = "black", lineheight = .6,
           label = "TATIONS ET DE LEURS FERMES. CE PROGRÈS S'EST ACCOMPLI SANS")+
  annotate(geom = "text",  x = 1.1, y = 1.5,  size = 3, color = "black", lineheight = .6,
           label = "SECOURS AUCUN DE L'ETAT ET EN PRÉSENCE DE LOIS DÉFAVORABLES.")+
  
  annotate(geom = "text",  x = 0.9, y = 0.6,  size = 3, color = "black", lineheight = .6,
           label = "IN 1860 NEARLY 90% OF BLACKS WERE SLAVES.")+
  annotate(geom = "text", x = 1, y = 0.4,size = 3, color = "black", lineheight = .6,
           label = " EN 1860 ENVIRON 90% DES NÈGRES ÉTAIENT ESCLAVES")
c



#data frame for the 1860 bar
df1 <- data.frame(
  Year = c(1860, 1890),
  Slave = c(89, NA),
  Free = c(11, NA)
)
#Reshape the data
df_long <- pivot_longer(df1, cols =-Year, names_to = "Category", values_to = "Value")

#Plot the stacked bar plot
c08_1860 <- ggplot(df_long, aes(x = Year, y = Value, fill = Category)) +
 geom_bar(stat = "identity") +
  annotate(geom = "text", x = 1860, y = 105,size = 6, color = "black", lineheight = .9,
           label = "1860" , fontface = "bold") +
  annotate(geom = "text", x = 1860, y = 95, size = 2, color = "black", lineheight = .9,
           label = "11%   FREE LABORERS\nOUVRIERS LIBRES", fontface = "bold") +
  scale_fill_manual(values = c("Slave" = "black", "Free" = "darkgreen")) +
 # scale_y_continuous(limits = c(0, 190)) +
 #scale_x_continuous(limits = c(1830, 1890)) +
  annotate(geom = "text", x = 1860, y = 50, size = 2, color = "red", lineheight = .9,
           label = "89%\nSLAVES\nESCLAVES", fontface = "bold") +
 # scale_fill_manual(values = c("Slave" = "black", "Free" = "darkgreen")) +
  theme_void() +
  theme(legend.position = "none")+
  theme(aspect.ratio = 2.5)  # Adjusting the width of the bar
c08_1860

 

#second data frame
df2 <- data.frame(
  Year = c(1860, 1890),
  Owners = c(19, NA),  
  Tenants = c(81, NA)  
)

# Reshape the data
df_long1 <- pivot_longer(df2, cols = -Year, names_to = "Category", values_to = "Value")

# Plot the stacked bar plot
c08_1890 <- ggplot(df_long1, aes(x = Year, y = Value, fill = Category)) +
  geom_bar(stat = "identity") +
  annotate(geom = "text", x = 1860, y = 105,size = 6, color = "black", lineheight = .9,
           label = "1890" , fontface = "bold") +
  annotate(geom = "text", x = 1860, y = 90,size = 2, color = "black", lineheight = .9,
           label = "19%\nPEASANT PROPRIETORS\nPAYSANS PROPRIETAIRES" , fontface = "bold") +
  annotate(geom = "text", x = 1860, y = 50, size = 2, color = "black", lineheight = .9,
           label = "81%\nTENANTS\nMÉTAYERS" , fontface = "bold") +
  scale_fill_manual(values = c("Owners" = "red", "Tenants" = "darkgreen")) +
  theme_void() +
  theme(legend.position = "none") +
  theme(aspect.ratio = 1.8)  # Adjusting the width of the bar
c08_1890




#combined plots using Cowplot package
plot <- plot_grid(
  c, c08_1890, c08_1860,align="hv",  rel_heights = c(4,3),rel_widths = c(7,2)
)
plot + plot_annotation(
  title = "THE RISE OF THE NEGROES FROM SLAVERY TO FREEDOM IN ONE GENERATION.\n
PROGRÈS GRADUEL DES NÈGRES DE L'ESCLAVAGE A LA LIBERTÉ EN UNE GÉNÉRATION.\n\n\n",
  subtitle = 'DONE BY ATLANTA UNIVERSITY',
  caption = 'Data: Du Bois Paris Exposition Plate 50\nCreated by: Simisani Nokulunga Ndaba',
  theme = theme(plot.title = element_text(size = 15,hjust = 0.5,face = "bold"),
                plot.subtitle = element_text(size = 13,hjust = 0.5,vjust = 0.5,face = "bold"),
                plot.caption = element_text(size = 13,hjust = 1,face = "bold"),
                plot.background = element_rect(fill = 'papayawhip'),
                axis.title.x=element_blank(),
                axis.text.x=element_blank(),
                axis.ticks.x=element_blank(),
                axis.title.y=element_blank())) & 
  theme(text = element_text('mono'))

ggsave("challenge08.png", width = 10, height = 10)

Challenge 08 - R code output

Challenge 08 - Original and Recreation

Challenge 09 - week 9

Proportion of Freemen And Slaves Among American Negroes _ plate 51

Challenge 09 - R code

#install packages to be used in the code
install.packages("ggplot2")
install.packages("tidyverse")

#load packages
library(ggplot2)
library(tidyverse)

#read and extract the data
challenge09 <- readr::read_csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge09/challenge09data.csv")


# Convert the dataset from wide to long format.
# Long format is easier to work with for some analyses and visualisations,
# especially with categorical variables or when plotting
# several variables in a single graph.
data_long <- tidyr::pivot_longer(challenge09, -Year, names_to = "Status", values_to = "Count")



# Create the stacked area plot
ggplot(data_long, aes(x = Year, y = Count, fill = Status)) +
  geom_area(color = "black") +
  scale_fill_manual(values = c("darkgreen", "black")) +
  annotate(geom = "text", label="SLAVES", fontface="bold",size=12,colour="white",x=1825,y=50)+
  annotate(geom = "text", label="ESCLAVES", fontface="bold",size=12,colour="white",x=1825,y=46)+
  annotate(geom = "text", label="FREE - LIBRE",fontface="bold",size=10,colour ="black",x=1825,y=95)+
  annotate(geom = "text", label="1790", fontface="bold",size=6,colour="black",x=1790,y=103)+
  annotate(geom = "text", label="8%", fontface="bold",size=5,colour="black",x=1790,y=93)+
  annotate(geom = "text", label="1800", fontface="bold",size=6,colour="black",x=1800,y=103)+
  annotate(geom = "text", label="11%", fontface="bold",size=5,colour="black",x=1800,y=90)+
  annotate(geom = "text", label="1810", fontface="bold",size=6,colour="black",x=1810,y=103)+
  annotate(geom = "text", label="13.5%", fontface="bold",size=5,colour="black",x=1810,y=88.5)+
  annotate(geom = "text", label="1820", fontface="bold",size=6,colour="black",x=1820,y=103)+
  annotate(geom = "text", label="13%", fontface="bold",size=5,colour="black",x=1820,y=89)+
  annotate(geom = "text", label="1830", fontface="bold",size=6,colour="black",x=1830,y=103)+
  annotate(geom = "text", label="14%", fontface="bold",size=5,colour="black",x=1830,y=88)+
  annotate(geom = "text", label="1840", fontface="bold",size=6,colour="black",x=1840,y=103)+
  annotate(geom = "text", label="13%", fontface="bold",size=5,colour="black",x=1840,y=89)+
  annotate(geom = "text", label="1850", fontface="bold",size=6,colour="black",x=1850,y=103)+
  annotate(geom = "text", label="12%", fontface="bold",size=5,colour="black",x=1850,y=90)+
  annotate(geom = "text", label="1860", fontface="bold",size=6,colour="black",x=1860,y=103)+
  annotate(geom = "text", label="11%", fontface="bold",size=5,colour="black",x=1860,y=91)+
  annotate(geom = "text", label="1870", fontface="bold",size=6,colour="black",x=1870,y=103)+
  annotate(geom = "text", label="100%", fontface="bold",size=5,colour="black",x=1870,y=90)+
  theme_void()+
  labs(title = "PROPORTION OF FREEMEN AND SLAVES AMONG AMERICAN NEGROES .\nPROPORTION DES NÈGRES LIBRES ET DES ESCLAVES EN AMÉRIQUE.\n",
       subtitle = "DONE BY ATLANTA UNIVERSITY.\n",
       caption = "source: 1900 Paris Exposition Plate 51 | graphic: simindaba")+
  theme(plot.title = element_text(hjust=0.5,size = 18,family="mono",colour = "black", face = "bold"),
        plot.caption = element_text(hjust=0.5,family="mono",face="bold",size = 14, colour = "black"),
        plot.subtitle = element_text(hjust=0.5,family="mono",face="bold",size = 18, colour = "black"),
        plot.background = element_rect(fill = 'papayawhip'),
        legend.position = "none")

ggsave("challenge09.png",width = 3,height = 5)
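
To make the wide-to-long step concrete, here is a small sketch. The percentages for 1790 and 1800 follow the labels on the chart; the column names Slave and Free are assumed, since the contents of the CSV are not shown on the slide.

```r
library(tidyr)

# Two years in wide format: one column per status (column names assumed)
wide <- data.frame(
  Year  = c(1790, 1800),
  Slave = c(92, 89),
  Free  = c(8, 11)
)

# One row per Year/Status pair
long <- pivot_longer(wide, -Year, names_to = "Status", values_to = "Count")
long
```

Each year now contributes one row per status, which is the shape geom_area() needs to stack the two groups with fill = Status.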

Challenge 09 - R code output

Challenge 09 - Original and Recreation

Challenge 10 - week 10

A Series of Statistical Charts, Illustrating The Condition Of The Descendants Of Former African Slaves Now Resident In The United States of America _ plate 37

Challenge 10 - R code

# install.packages(c("ggplot2", "tidyverse", "ggtext", "ggfx", "patchwork", "cowplot", "ggforce", "maps"))
library(ggplot2)
library(tidyverse)
library(ggtext)
library(ggfx)
library(patchwork)
library(cowplot)
library(ggforce)
library(maps)

challenge10 <- readr::read_csv("https://raw.githubusercontent.com/sndaba/2024DuBoisChallengeInRstats/main/challenge10/challenge10data.csv")


                                         #title annotation
titlePlot <- data.frame(x=c(1, 2, 3, 4, 5),
                   y=c(2, 4, 6, 8, 12))
titleTop <- ggplot(data = titlePlot, aes(x, y)) +
  theme_void()+
  annotate(geom = "text",size = 6,family ="mono", color = "black", fontface="bold",lineheight = .6,x=1,y=3,
           label = "A SERIES OF STATISTICAL CHARTS ILLUSTRA-\n
           TING THE CONDITION OF THE DESCENDANTS OF FOR-\n
           MER AFRICAN SLAVES NOW RESIDENT IN THE UNITED STATES OF AMERICA.")

                                        #subtitle annotation
subtitlePlot <- data.frame(x=c(1, 2, 3, 4, 5),
                        y=c(2, 4, 6, 8, 12))
subtitleTop <- ggplot(data = subtitlePlot, aes(x, y)) +
  theme_void()+
  annotate(geom = "text",size =5, color = "red",lineheight = .6,x=1,y=3,family = "mono",
           label = "UNE SERIE DE CARTES ET DIAGRAMMES STATISTIQUES MONTRANT LA-\n
           CONDITION PRESENTE DES DESCENDANTS DES ANCIENS ESCLAVES AFRI-\n
           CAINS ACTUELLEMENT ETABLIS DANS LES ETATS UNIS D'AMERIQUE.")


                                      #Get map data for USA states
usa_map <- map_data("state")

# Define colors for each state (replace these with your desired colors)
state_colors <- c(
  "Alabama" = "red",
  "Arizona" = "lightblue",
  "Arkansas" = "lightgreen",
  "California" = "red",
  "Colorado" = "red",
  "Connecticut" = "gold",
  "Delaware" = "red",
  "Florida" = "grey",
  "Georgia" = "black",
  "Idaho" = "lightpink",
  "Illinois" = "seashell2",
  "Indiana" = "darkorange",
  "Iowa" = "seagreen1",
  "Kansas" = "seashell2",
  "Kentucky" = "saddlebrown",
  "Louisiana" = "sandybrown",
  "Maine" = "lightgreen",
  "Maryland" = "saddlebrown",
  "Massachusetts" = "red",
  "Michigan" = "navyblue",
  "Minnesota" = "blue",
  "Mississippi" = "lightblue",
  "Missouri" = "red",
  "Montana" = "seashell2",
  "Nebraska" = "gold",
  "Nevada" = "seashell2",
  "New Hampshire" = "gold",
  "New Jersey" = "lightgreen",
  "New Mexico" = "lightgreen",
  "New York" = "seashell2",
  "North Carolina" = "lightpink",
  "North Dakota" = "lightpink",
  "Ohio" = "lightblue",
  "Oklahoma" = "pink",
  "Oregon" = "lightgreen",
  "Pennsylvania" = "lightseagreen",
  "Rhode Island" = "navyblue",
  "South Carolina" = "gold",
  "South Dakota" = "saddlebrown",
  "Tennessee" = "navyblue",
  "Texas" = "gold",
  "Utah" = "saddlebrown",
  "Vermont" = "seashell2",
  "Virginia" = "salmon",
  "Washington" = "seagreen",
  "West Virginia" = "lightblue",
  "Wisconsin" = "gold",
  "Wyoming" = "navyblue"
)
state_colors <- setNames(state_colors, tolower(names(state_colors)))

                              # Plot the map with matching state names
us <- ggplot() +
  geom_polygon(data = usa_map, aes(x = long, y = lat, group = group, fill = region)) +
  scale_fill_manual(values = state_colors) +  # named vector: colours are matched to regions by name
  coord_fixed() + #1.3
  annotate(geom = "text",label="Centre of Negro Population.\nATLANTA UNIVERSITY.",
           x=-100, y=25,size=2.5,family = "mono")+
  theme_void() +
  theme(legend.position = "none")  # set after theme_void() so the legend stays hidden

                                     #map annotations bottom of map
map1 <- data.frame(x=c(1, 2, 3, 4, 5),
                     y=c(2, 4, 6, 8, 12))
mapann1 <- ggplot(data = map1, aes(x, y)) +
  theme_void()+
  annotate(geom = "text",size = 4, color = "black", lineheight = .6,x=1,y=3,family = "mono",
           label = "THE UNIVERSITY WAS FOUNDED IN 1867. IT HAS INSTRUCTED 6000 NEGRO STUDENTS.\n
           L'UNIVERSITE A ETE FONDEE EN 1867. ELLE A DONNE L'INSTRUCTION A 6000 ETUDIANTS NEGRES.\n
                              IT HAS GRADUATED 330 NEGROES AMONG WHOM ARE:\n
           ELLE A DELIVRE DES DIPLOMES A 330 NEGRES DONT :")

                                      #map annotations:english side
map2 <- data.frame(x=c(1, 2, 3, 4, 5),
                       y=c(2, 4, 6, 8, 12))
mapann2 <- ggplot(data = map2, aes(x, y)) +
  theme_void()+
  annotate(geom = "text",size = 4, color = "black", lineheight = .6,x=1,y=3,family ="mono",
           label = "PREPARED AND EXECUTED BY\n
           NEGRO STUDENTS UNDER THE\n
           DIRECTION OF\n
           ATLANTA UNIVERSITY.\n
           ATLANTA, GA.\n
           UNITED STATES OF AMERICA.")

                                    #map annotations:french side
map3 <- data.frame(x=c(1, 2, 3, 4, 5),
                       y=c(2, 4, 6, 8, 12))
mapann3 <- ggplot(data = map3, aes(x, y)) +
  theme_void()+
  annotate(geom = "text",size = 4, color = "black", lineheight = .6,x=1,y=3,family ="mono",
           label = "PREPAREES ET EXECUTEES PAR\n 
           DES ETUDIANTS NEGRES SOUS\n
           LA DIRECTION DE L'UNIVERSITE\n
           D'ATLANTA.\n
           ETAT DE GEORGIE.\n
           ETATS UNIS D'AMERIQUE.")


                                   #making the pie chart
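order <- c(2,1,3,4,5,6)
# `occupation` and `percentage` are not defined in the code shown; they are
# assumed here to be the first two columns of the CSV read into `challenge10`
occupation <- challenge10[[1]]
percentage <- challenge10[[2]]
challenge10 <- data.frame(Occupation = occupation, Percentage = percentage, Order = order)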

challenge10 <- challenge10[order(-challenge10$Percentage), ]
# Create a new column for ordering
challenge10$order <- seq_len(nrow(challenge10))
challenge10
# Adjust the order so that the second highest follows the highest on the left side
challenge10$order <- ifelse(challenge10$order == 2, 1, ifelse(challenge10$order == 1, 2, challenge10$order))

# Create an ordered pie chart with the second highest following the highest on the left side
m <- ggplot(challenge10, aes(x = "", y = Percentage, fill = Occupation, label = paste0(Percentage, "%"), order = order)) +
  geom_bar(stat = "identity", width = 1) +
  coord_polar("y", start = -pi/2) +  # offset the start by a quarter turn from the default 12 o'clock position
  scale_fill_manual(values = c("ivory","pink","yellow","blue","lightgreen","red")) +  # Use a color palette
  theme_void() +
  theme(legend.position = "none")+
  geom_text(color = "black", size = 4, position = position_stack(vjust = 0.5))

                                              #side legends
                                            #left side legend
dff <- data.frame(x=c(1, 2, 2, 3, 3, 4, 8, 10),
                  y=c(2, 4, 5, 4, 7, 9, 10, 12))
c <- ggplot(data = dff, aes(x, y)) +
  theme_void()+
  geom_circle(aes(x0=1, y0=11, r=0.5), fill='#FF0000')+
  geom_circle(aes(x0=1, y0=10, r=0.5), fill='blue')+
  geom_circle(aes(x0=1, y0=9, r=0.5), fill='pink')+
  geom_circle(aes(x0=1, y0=8, r=0.5), fill='ivory')+
  geom_circle(aes(x0=1, y0=7, r=0.5), fill='lightgreen')+
  geom_circle(aes(x0=1, y0=6, r=0.5), fill='yellow')+
  scale_x_continuous(expand = c(0, 0), limits = c(0, 10)) +
  coord_fixed() +
  annotate(geom = "text", family = "mono",x = 3.5, y = 11,  size = 4, color = "black", lineheight = .6,label = "TEACHERS")+
  annotate(geom = "text", family = "mono",x = 3.5, y = 10,  size = 4, color = "black", lineheight = .6, label = "MINISTERS") +
  annotate(geom = "text", family = "mono",x = 5, y = 9,  size = 4, color = "black", lineheight = .6, label = "GOVERNMENT SERVICE") +
  annotate(geom = "text", family = "mono",x = 3.5, y = 8, size = 4, color = "black", lineheight = .6, label = "BUSINESS") + 
  annotate(geom = "text", family = "mono",x = 5, y = 7,  size = 4, color = "black", lineheight = .6, label = "OTHER PROFESSIONS") +
  annotate(geom = "text", family = "mono",x = 3.8, y = 6, size = 4, color = "black", lineheight = .6, label = "HOUSE WIVES") 


                                           #right side legend
othdff <- data.frame(x=c(1, 2, 3, 4, 5),
                  y=c(2, 4, 6, 8, 12))
d <- ggplot(data = othdff, aes(x, y)) +
  theme_void()+
  geom_circle(aes(x0=9, y0=11, r=0.5), fill='#FF0000')+
  geom_circle(aes(x0=9, y0=10, r=0.5), fill='blue')+
  geom_circle(aes(x0=9, y0=9, r=0.5), fill='pink')+
  geom_circle(aes(x0=9, y0=8, r=0.5), fill='ivory')+
  geom_circle(aes(x0=9, y0=7, r=0.5), fill='lightgreen')+
  geom_circle(aes(x0=9, y0=6, r=0.5), fill='yellow')+
  scale_x_continuous(expand = c(0, 0), limits = c(0, 10))+
  coord_fixed() +
  annotate(geom = "text", family = "mono",x = 4.5, y = 11,size = 4, color = "black", lineheight = .6,label = "PROFESSEURS ET INSTITUTEURS")+
  annotate(geom = "text", family = "mono",x = 4.5, y = 10,size = 4, color = "black", lineheight = .6, label = "MINISTRES DE L'EVANGILE") +
  annotate(geom = "text", family = "mono",x = 4.5, y = 9,size = 4, color = "black", lineheight = .6, label = "EMPLOYES DU GOUVERNEMENT") +
  annotate(geom = "text", family = "mono",x = 4.5, y = 8,size = 4, color = "black", lineheight = .6, label = "MARCHANDS") + 
  annotate(geom = "text", family = "mono",x = 4, y = 7,size = 3.5, color = "black", lineheight = .6, label = "MEDECINS,AVOCATS,ETUDIANTS") +
  annotate(geom = "text", family = "mono",x = 4.5, y = 6,size = 4, color = "black", lineheight = .6, label = "MERES DE FAMILLE") 


                                           #bottom annotation text (English and French)
anntbl <- data.frame(x=c(1, 2, 3, 4, 5),
                     y=c(2, 4, 6, 8, 12))
ann <- ggplot(data = anntbl, aes(x, y)) +
  theme_void()+
  annotate(geom = "text",size = 3, color = "black", lineheight = .6,x=1,y=1,family = "mono",
                 label = "THE UNIVERSITY HAS 20 PROFESSORS AND INSTRUCTORS AND 250 STUDENTS AT PRESENT.\n
                 IT HAS FIVE BUILDINGS, 60 ACRES OF CAMPUS, AND A LIBRARY OF 11,000 VOLUMES. IT AIMS TO RAISE\n
                 AND CIVILIZE THE SONS OF THE FREEDMEN BY TRAINING THEIR MORE CAPABLE MEMBERS IN THE LIBER-\n
                 AL ARTS ACCORDING TO THE BEST STANDARDS OF THE DAY.\n
                    THE PROPER ACCOMPLISHMENT OF THIS WORK DEMANDS AN ENDOWMENT FUND OF $500,000.\n
                     L'UNIVERSITE A ACTUELLEMENT 20 PROFESSEURS ET INSTRUCTEURS ET 250 ETUDIANTS.\n
                     ELLE EST COMPOSEE DE CINQ BATIMENTS, 60 ACRES (ENVIRON 26 HECTARES) DE TERRAIN SERVANT DE\n
                 COUR ET DE CHAMP DE RECREATION, ET D'UNE BIBLIOTHEQUE CONTENANT 11,000 VOLUMES.\n
                    SON BUT EST D'ELEVER ET DE CIVILISER LES FILS DES NEGRES AFFRANCHIS EN DONNANT AUX MIEUX\n
                 DOUES UNE EDUCATION DANS LES ARTS LIBERAUX EN ACCORD AVEC LES IDEES LES PLUS PROGRES-\n
                 SISTES DE L'EPOQUE.\n
                 L'ACCOMPLISSEMENT DE CETTE OEUVRE DEMANDE UNE DOTATION DE $500,000 (2,500,000 FRANCS).\n
           source:Paris Exposition Plate 37 | Graphic by: Simisani Ndaba")


                                               #caption
simi <- data.frame(x=c(1, 2, 3, 4, 5),
                     y=c(2, 4, 6, 8, 12))
simicaption <- ggplot(data = simi, aes(x, y)) +
  theme_void()+
  annotate(geom = "text",size = 8,fontface="bold", color = "black", lineheight = .6,x=1,y=3,family = "mono",
           label = "source:Paris Exposition Plate 37 | Graphic by: Simisani Ndaba")


                                       #combined plots using cowplot::plot_grid()
titles <- plot_grid(titleTop,subtitleTop,ncol=1)   #titles annotation

top <- plot_grid(mapann2,us,mapann3,ncol=3)   #map and side annotations

topplus <- plot_grid(top,mapann1,ncol=1)     #titles and map annotations

             
bottom <- plot_grid(c,m,d,ncol=3)          #pie chart
bottomAnn <- plot_grid(bottom,ann,ncol=1)   #pie chart annotations


                                     #final plot
final_plot <- plot_grid(titles,topplus,bottomAnn,simicaption,ncol=1)

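final_plot <- final_plot +
  theme(panel.background = element_blank(),
        plot.background = element_rect(fill = 'papayawhip'),
        plot.margin = unit(c(-0.5, -0.5, -0.5,-0.5), "cm"))

final_plot                           #display the finished plot

                                     #save the finished plot explicitly instead of relying on last_plot()
ggsave("challenge10.png", plot = final_plot, width = 13, height = 15)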
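
The slice-ordering step in the pie chart code (sort by percentage, number the rows, then swap ranks 1 and 2) can be checked on its own without drawing anything. This is a minimal base R sketch on made-up values: the data frame toy, its occupations A to D and its percentages are hypothetical and are not the plate's data.

```r
# Hypothetical data, for illustration only.
toy <- data.frame(
  Occupation = c("A", "B", "C", "D"),
  Percentage = c(10, 40, 30, 20)
)

# Sort from largest to smallest share, then number the rows 1..n.
toy <- toy[order(-toy$Percentage), ]
toy$rank <- seq_len(nrow(toy))

# Swap ranks 1 and 2 so that the second-largest share comes first.
toy$rank[1:2] <- c(2, 1)

# Reordering by the swapped rank gives the intended slice sequence.
slice_order <- as.character(toy$Occupation[order(toy$rank)])
slice_order
```

As far as I know, current ggplot2 releases stack bars by the levels of the fill variable and no longer use an order aesthetic, so a rank column like this would most likely need to be turned into factor levels, for example factor(Occupation, levels = slice_order), before it changes the drawn order.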

Challenge 10 - R code output

Challenge 10 - Original and Recreation

Other Data Viz Challenges

Resources

Get in Touch

Mastodon: @simisani

LinkedIn: Simisani Ndaba

GitHub: sndaba

Check out the R-Ladies Gaborone WhatsApp group

Presentation Slides

https://simi.quarto.pub/2024-du-bois-challenge/#/title-slide