COVID-19 Vaccination Dashboard in R

The COVID-19 pandemic has led to a global vaccination effort of unprecedented scale. Monitoring vaccine distribution and administration is essential for assessing progress and identifying areas for improvement. In this article, we build a COVID-19 Vaccination Tracker Dashboard in R that lets stakeholders visualize and analyze COVID-19 data interactively.

Dataset Overview

The dataset used in this analysis contains country-level COVID-19 case metrics: confirmed cases, deaths, recoveries, active cases, and week-over-week changes, along with the WHO region of each country. Each record is a single snapshot for one country or region. The file has no date column and no vaccine dose counts, so the plots in this article work with case metrics.

Dataset Link: COVID-19 Vaccination

The dataset consists of COVID-19-related metrics for different countries or regions. Here’s an introduction to each column:

  • Country.Region: The name of the country or region.
  • Confirmed: The total number of confirmed COVID-19 cases.
  • Deaths: The total number of deaths attributed to COVID-19.
  • Recovered: The total number of individuals who have recovered from COVID-19.
  • Active: The number of active COVID-19 cases (Confirmed – Deaths – Recovered).
  • New.cases: The number of new confirmed COVID-19 cases reported.
  • New.deaths: The number of new deaths attributed to COVID-19 reported.
  • New.recovered: The number of new recoveries reported.
  • Deaths...100.Cases: The number of deaths per 100 confirmed cases.
  • Recovered...100.Cases: The number of recoveries per 100 confirmed cases.
  • Deaths...100.Recovered: The number of deaths per 100 recovered cases.
  • Confirmed.last.week: The total number of confirmed cases as of one week earlier.
  • X1.week.change: The absolute change in confirmed cases compared to the previous week.
  • X1.week...increase: The percentage increase in confirmed cases compared to the previous week.
  • WHO.Region: The World Health Organization (WHO) region to which the country or region belongs.
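Several of these columns are derived from the others. The sketch below, in base R, recomputes them for the first three rows of the dataset (values copied from the head() output shown later in this article). The formulas are inferred from those values rather than taken from the dataset's documentation, and the `check` data frame is a hypothetical name used only here.

```r
# First three rows of the dataset (raw columns only)
check <- data.frame(
  Country.Region      = c("Afghanistan", "Albania", "Algeria"),
  Confirmed           = c(36263, 4880, 27973),
  Deaths              = c(1269, 144, 1163),
  Recovered           = c(25198, 2745, 18837),
  Confirmed.last.week = c(35526, 4171, 23691)
)

# Recompute the derived columns (formulas inferred from the published values)
check$Active                <- check$Confirmed - check$Deaths - check$Recovered
check$Deaths...100.Cases    <- round(100 * check$Deaths / check$Confirmed, 2)
check$Recovered...100.Cases <- round(100 * check$Recovered / check$Confirmed, 2)
check$X1.week.change        <- check$Confirmed - check$Confirmed.last.week
check$X1.week...increase    <- round(100 * check$X1.week.change /
                                       check$Confirmed.last.week, 2)

print(check)
```

The recomputed values agree with the head() output for these rows, which indicates that X1.week...increase is a percentage rather than a yes/no flag.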

Visualization of COVID-19 Data

The following visualizations highlight patterns in the COVID-19 data, helping stakeholders understand the pandemic’s impact and make informed decisions.

Load the dataset and the required packages

Before delving into the visualizations, let’s load the necessary packages and the dataset containing the COVID-19 metrics:

R
# Load required packages
library(ggplot2)   # For creating visualizations
library(dplyr)     # For data manipulation

# Load the dataset (assumes the data is saved as covid_data.csv
# in the working directory)
data <- read.csv("covid_data.csv")
head(data)

Output:

       Country.Region Confirmed Deaths Recovered Active New.cases New.deaths
1         Afghanistan     36263   1269     25198   9796       106         10
2             Albania      4880    144      2745   1991       117          6
3             Algeria     27973   1163     18837   7973       616          8
4             Andorra       907     52       803     52        10          0
5              Angola       950     41       242    667        18          1
6 Antigua and Barbuda        86      3        65     18         4          0
  New.recovered Deaths...100.Cases Recovered...100.Cases Deaths...100.Recovered
1            18               3.50                 69.49                   5.04
2            63               2.95                 56.25                   5.25
3           749               4.16                 67.34                   6.17
4             0               5.73                 88.53                   6.48
5             0               4.32                 25.47                  16.94
6             5               3.49                 75.58                   4.62
  Confirmed.last.week X1.week.change X1.week...increase            WHO.Region
1               35526            737               2.07 Eastern Mediterranean
2                4171            709              17.00                Europe
3               23691           4282              18.07                Africa
4                 884             23               2.60                Europe
5                 749            201              26.84                Africa
6                  76             10              13.16              Americas
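The dotted column names in this output come from read.csv(), which by default passes the file's headers through make.names() to turn them into syntactically valid R names. The sketch below shows that conversion on a tiny inline CSV. The original header spellings (such as "Deaths / 100 Cases") are an assumption, since the raw file is not shown here.

```r
# A tiny inline CSV; the header spellings are assumed, not taken from the real file
csv_text <- "Country/Region,Deaths / 100 Cases,1 week % increase
Afghanistan,3.50,2.07
Albania,2.95,17.00"

# Default behaviour: invalid characters become dots, and names that
# start with a digit get an "X" prefix
converted <- read.csv(text = csv_text)
print(names(converted))

# check.names = FALSE keeps the headers as written; such columns
# then need backticks, e.g. original$`Deaths / 100 Cases`
original <- read.csv(text = csv_text, check.names = FALSE)
print(names(original))
```

Keeping the default conversion is usually more convenient for ggplot2 code, because the dotted names can be used in aes() without backticks.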

Trend of New Cases Over Time

The trend of new COVID-19 cases over time is a critical indicator of the progression of the pandemic and the effectiveness of containment measures.

R
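# The dataset has one row per country/region and no date column,
# so the x-axis is the row index, not a calendar day
ggplot(data, aes(x = seq_len(nrow(data)), y = New.cases, group = 1)) +
  geom_line(color = "blue") +
  labs(title = "Trend of New Cases Over Time",
       x = "Row index (one row per country/region)", y = "New Cases")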

Output:

[Figure: line chart titled “Trend of New Cases Over Time”]

The resulting visualization is a line chart of the New.cases column. Because the dataset holds a single snapshot per country or region and has no date column, the x-axis is the row index rather than a calendar date, and the y-axis is the number of new cases reported by each country or region. The line therefore connects countries in file order: spikes mark countries with many new cases, not changes over time. Plotting a genuine time trend requires a dataset with one row per date.
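Because each row is a country rather than a day, a ranked comparison may be a more faithful view of New.cases than a line. The following is a minimal sketch, assuming the six rows from the head() output above stand in for the full dataset. The names `snapshot`, `ranked`, and `p` are hypothetical, and the plot is only built if ggplot2 is installed.

```r
# Six rows copied from the head() output
snapshot <- data.frame(
  Country.Region = c("Afghanistan", "Albania", "Algeria",
                     "Andorra", "Angola", "Antigua and Barbuda"),
  New.cases      = c(106, 117, 616, 10, 18, 4),
  stringsAsFactors = FALSE
)

# Rank countries by new cases, largest first
ranked <- snapshot[order(-snapshot$New.cases), ]
print(ranked)

# Horizontal bar chart ordered by value (skipped if ggplot2 is missing);
# display it with print(p)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(ranked, aes(x = reorder(Country.Region, New.cases), y = New.cases)) +
    geom_col(fill = "blue") +
    coord_flip() +
    labs(title = "New Cases by Country/Region", x = NULL, y = "New Cases")
}
```

With the full dataset, taking head(ranked, 10) before plotting would keep the chart readable.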

Scatter Plot for Deaths vs. Recovered

The comparison between deaths and recoveries is crucial in assessing the severity and impact of the COVID-19 pandemic.

R
ggplot(data, aes(x = Deaths, y = Recovered, color = WHO.Region)) +
  geom_point() +
  labs(title = "Deaths vs. Recovered", x = "Deaths", y = "Recovered")

Output:

[Figure: scatter plot titled “Deaths vs. Recovered”, colored by WHO region]

The resulting visualization is a scatter plot that illustrates the relationship between deaths and recovered cases due to COVID-19. Each point on the plot represents a country or region, with its position indicating the number of deaths and recovered cases. The x-axis represents the total number of deaths, while the y-axis represents the total number of recovered cases.
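A scatter plot of raw counts can be dominated by a few large countries. One way to quantify the relationship is a correlation on a log scale. The sketch below does this in base R for the six rows from the head() output, so it is only an illustration and not a result for the full dataset. The variable names are hypothetical.

```r
# Deaths and Recovered for the six rows in the head() output
deaths    <- c(1269, 144, 1163, 52, 41, 3)
recovered <- c(25198, 2745, 18837, 803, 242, 65)

# Counts span several orders of magnitude, so compare them on a log10 scale
r_log <- cor(log10(deaths), log10(recovered))
print(r_log)
```

In ggplot2 the same idea can be applied to the plot itself by adding scale_x_log10() and scale_y_log10(), provided no country has a zero count.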

Deaths per 100 Cases: Analyzing COVID-19 Mortality Rates

The metric “Deaths per 100 Cases” is the case fatality ratio expressed as a percentage: deaths divided by confirmed cases, multiplied by 100. Because it is normalized by case count, it allows the severity of the pandemic to be compared across countries or regions regardless of their total number of cases.
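As a worked example of this metric, the sketch below computes deaths per 100 cases for each row and then a pooled value per WHO region (total deaths over total confirmed cases). It uses the six rows from the head() output. The `rows` and `pooled` data frames and the `Deaths.per.100` column are hypothetical names, and pooling is one possible way to summarize a region, not something the dataset provides.

```r
# Six rows copied from the head() output
rows <- data.frame(
  Country.Region = c("Afghanistan", "Albania", "Algeria",
                     "Andorra", "Angola", "Antigua and Barbuda"),
  Confirmed      = c(36263, 4880, 27973, 907, 950, 86),
  Deaths         = c(1269, 144, 1163, 52, 41, 3),
  WHO.Region     = c("Eastern Mediterranean", "Europe", "Africa",
                     "Europe", "Africa", "Americas"),
  stringsAsFactors = FALSE
)

# Per-country ratio: deaths per 100 confirmed cases
rows$Deaths.per.100 <- round(100 * rows$Deaths / rows$Confirmed, 2)

# Pooled per-region ratio: sum the counts first, then divide
pooled <- aggregate(cbind(Confirmed, Deaths) ~ WHO.Region, data = rows, FUN = sum)
pooled$Deaths.per.100 <- round(100 * pooled$Deaths / pooled$Confirmed, 2)

print(rows[, c("Country.Region", "Deaths.per.100")])
print(pooled)
```

The pooled value weights each country by its case count, so it can differ from the median shown in a box plot, which treats every country equally.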

R
# Create the box plot
ggplot(data, aes(x = WHO.Region, y = Deaths...100.Cases, fill = WHO.Region)) +
  geom_boxplot() +
  labs(title = "Deaths per 100 Cases by WHO Region", x = "WHO Region", 
       y = "Deaths per 100 Cases") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Output:

[Figure: box plot titled “Deaths per 100 Cases by WHO Region”]

The resulting visualization is a box plot that shows the distribution of “Deaths per 100 Cases” across different WHO regions. Each box spans the interquartile range (IQR), from the first to the third quartile, with the line inside the box indicating the median. The whiskers extend to the most extreme values that lie within 1.5 times the IQR of the box edges, and points beyond the whiskers are plotted individually as outliers.
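The box plot statistics described above can be worked out by hand. The sketch below applies the 1.5 × IQR rule in base R to the six Deaths...100.Cases values from the head() output. It uses quantile() with its default method, and the variable names are hypothetical.

```r
# Deaths per 100 cases for the six rows in the head() output
x <- c(3.50, 2.95, 4.16, 5.73, 4.32, 3.49)

q     <- quantile(x, c(0.25, 0.5, 0.75))   # quartiles (default method)
iqr   <- q[["75%"]] - q[["25%"]]           # interquartile range
lower <- q[["25%"]] - 1.5 * iqr            # lower fence
upper <- q[["75%"]] + 1.5 * iqr            # upper fence

# Whiskers end at the most extreme observations inside the fences
whiskers <- range(x[x >= lower & x <= upper])

# Anything outside the fences is drawn as an individual point
outliers <- x[x < lower | x > upper]

print(q)
print(whiskers)
print(outliers)
```

For these six values, 5.73 lies above the upper fence and would be drawn as an outlier point, while the whiskers stop at 2.95 and 4.32.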

Recovered Cases Over Time

An area chart is a good way to visualize cumulative data, such as the number of recovered COVID-19 cases. Because this dataset has no date column, the code below assigns one placeholder date per row, so the chart shows a running total of recoveries across countries in file order rather than a true timeline.

R
# Assuming the dataset does not contain a Date column, create a Date column based on row numbers
data$Date <- seq(as.Date("2020-01-01"), by = "day", length.out = nrow(data))

# Create the area chart
ggplot(data, aes(x = Date, y = cumsum(Recovered))) +
  geom_area(fill = "lightblue", color = "blue") +
  labs(title = "Cumulative Recovered Cases Over Time", x = "Date", 
       y = "Cumulative Recovered Cases") +
  theme_minimal()

Output:

[Figure: area chart titled “Cumulative Recovered Cases Over Time”]

The resulting visualization is an area chart of the running total of recovered COVID-19 cases. The x-axis shows the placeholder dates (one per row), while the y-axis shows the cumulative number of recovered cases. The height of the area at each point is the sum of Recovered over all rows up to that one, so the final value equals the total across all countries or regions.
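To make the cumulative calculation concrete, the sketch below runs cumsum() on the Recovered values of the six rows from the head() output, paired with placeholder dates built the same way as in the code above. The `running` data frame is a hypothetical name.

```r
# Recovered counts for the six rows in the head() output
recovered <- c(25198, 2745, 18837, 803, 242, 65)

# Placeholder dates, one per row (not real reporting dates)
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = length(recovered))

# Running total: each entry is the sum of all rows up to and including it
running <- data.frame(Date = dates,
                      Recovered = recovered,
                      Cumulative = cumsum(recovered))
print(running)
```

The last Cumulative value, 47890, equals sum(recovered), which is what the right edge of the area chart represents.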

COVID-19 Vaccination Tracker Dashboard

Creating a dashboard to visualize various aspects of the COVID-19 data, including trends of new cases, deaths vs. recovered, and deaths per 100 cases, provides a comprehensive overview of the pandemic’s impact. We’ll use R and the shiny package to create an interactive dashboard.

R
library(shiny)
library(ggplot2)
library(dplyr)

# Load the dataset
data <- read.csv("covid_data.csv")

# Assuming the dataset does not contain a Date column
data$Date <- seq(as.Date("2020-01-01"), by = "day", length.out = nrow(data))

# UI for the dashboard
ui <- fluidPage(
  titlePanel("COVID-19 Dashboard"),
  sidebarLayout(
    sidebarPanel(
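      # "Global" is added to the choices so that all regions can be shown at once
      selectInput("region", "Select WHO Region:", 
                  choices = c("Global", sort(unique(as.character(data$WHO.Region)))), 
                  selected = "Global")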
    ),
    mainPanel(
      tabsetPanel(
        tabPanel("Trend of New Cases Over Time", plotOutput("newCasesPlot")),
        tabPanel("Deaths vs. Recovered", plotOutput("deathsRecoveredPlot")),
        tabPanel("Deaths per 100 Cases by WHO Region", 
                 plotOutput("deathsPer100CasesPlot")),
        tabPanel("Cumulative Recovered Cases Over Time", 
                 plotOutput("cumulativeRecoveredPlot"))
      )
    )
  )
)

# Server logic for the dashboard
server <- function(input, output) {
  
  # Filter data based on selected region
  filteredData <- reactive({
    if (input$region == "Global") {
      data
    } else {
      data %>% filter(WHO.Region == input$region)
    }
  })
  
  # Plot for Trend of New Cases Over Time
  output$newCasesPlot <- renderPlot({
    ggplot(filteredData(), aes(x = Date, y = New.cases)) +
      geom_line(color = "blue") +
      labs(title = "Trend of New Cases Over Time", x = "Date", y = "New Cases") +
      theme_minimal()
  })
  
  # Plot for Deaths vs. Recovered
  output$deathsRecoveredPlot <- renderPlot({
    ggplot(filteredData(), aes(x = Deaths, y = Recovered, color = WHO.Region)) +
      geom_point() +
      labs(title = "Deaths vs. Recovered", x = "Deaths", y = "Recovered") +
      theme_minimal()
  })
  
  # Plot for Deaths per 100 Cases by WHO Region
  output$deathsPer100CasesPlot <- renderPlot({
    ggplot(filteredData(), aes(x = WHO.Region, y = Deaths...100.Cases, fill = WHO.Region)) +
      geom_boxplot() +
      labs(title = "Deaths per 100 Cases by WHO Region", x = "WHO Region", 
           y = "Deaths per 100 Cases") +
      theme_minimal() +
      theme(axis.text.x = element_text(angle = 45, hjust = 1))
  })
  
  # Plot for Cumulative Recovered Cases Over Time
  output$cumulativeRecoveredPlot <- renderPlot({
    ggplot(filteredData(), aes(x = Date, y = cumsum(Recovered))) +
      geom_area(fill = "lightblue", color = "blue") +
      labs(title = "Cumulative Recovered Cases Over Time", x = "Date", 
           y = "Cumulative Recovered Cases") +
      theme_minimal()
  })
}

# Run the application 
shinyApp(ui = ui, server = server)

Output:

[Figure: the “COVID-19 Dashboard” app with a WHO region selector and four plot tabs]

The main components of the app are:

  • The titlePanel sets the title of the dashboard.
  • The sidebarPanel contains a selectInput for choosing a WHO region.
  • The mainPanel contains a tabsetPanel with four tabs, each displaying a different plot.
  • The filteredData() reactive expression filters the dataset to the selected WHO region, or returns all rows when "Global" is selected.
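The filtering logic can be tried outside Shiny as an ordinary function. The sketch below uses the six rows from the head() output. `filter_by_region()` and `region_choices` are hypothetical names that mirror the body of filteredData() and the dropdown choices. Note that unique() on the WHO.Region column does not contain "Global", so the sketch adds it to the choices explicitly.

```r
# Six rows copied from the head() output
covid <- data.frame(
  Country.Region = c("Afghanistan", "Albania", "Algeria",
                     "Andorra", "Angola", "Antigua and Barbuda"),
  WHO.Region     = c("Eastern Mediterranean", "Europe", "Africa",
                     "Europe", "Africa", "Americas"),
  stringsAsFactors = FALSE
)

# Choices for the dropdown: "Global" first, then the regions present in the data
region_choices <- c("Global", sort(unique(covid$WHO.Region)))

# Hypothetical helper mirroring the body of filteredData()
filter_by_region <- function(df, region) {
  if (identical(region, "Global")) {
    df                                            # no filtering
  } else {
    df[df$WHO.Region == region, , drop = FALSE]   # keep matching rows only
  }
}

print(region_choices)
print(filter_by_region(covid, "Europe")$Country.Region)
print(nrow(filter_by_region(covid, "Global")))
```

Keeping the filter in a plain function like this makes it easy to check before wiring it into reactive().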

Four renderPlot() calls generate one plot for each tab:

  • Trend of New Cases Over Time: Displays the trend of new COVID-19 cases using a line chart.
  • Deaths vs. Recovered: Shows a scatter plot of deaths versus recovered cases.
  • Deaths per 100 Cases by WHO Region: Visualizes the distribution of deaths per 100 cases using a box plot.
  • Cumulative Recovered Cases Over Time: Illustrates the cumulative number of recovered cases over time using an area chart.

Save the above code in an app.R file and launch it with shiny::runApp() from the folder that contains the file, or with the Run App button in RStudio. The dashboard lets users select a WHO region and explore the corresponding visualizations of the COVID-19 data.