Automating & Elevating Assessment Analysis & Reporting with R/ggplot

Sample report

Author: Scott Moore
Affiliation: Furman University Center for Innovative Leadership
Published: December 17, 2024

1 Introduction & Motivation

1.1 Current pain points

Manual Excel workflows
  • Repetitive
  • Prone to errors
  • Difficult to trace
Time-consuming
  • Much of the effort goes into cleaning and preparing data, leaving little time for deeper analysis.
Redundancy
  • The same graphs and analyses must be recreated every time new data arrives.

1.2 Benefits of R-based process

Automation
  • ETL (Extract, Transform, Load) processes that save time and reduce errors.
Flexibility
  • Handle large datasets
  • Generate exploratory visuals quickly
  • Produce polished reports
Transparency
  • A documented, repeatable process that ensures consistency and accountability.

1.3 Something to think about

Imagine spending less time wrestling with spreadsheets and more time delivering insights that drive real decisions.

1.4 Benefits of ggplot graphics

Flexibility

ggplot allows you to create a wide variety of plots (e.g., faceted plots, histograms, boxplots, heatmaps) beyond Excel’s standard offerings, and it’s easy to customize virtually every aspect of the plot.

Data Transparency

With ggplot, you define each part of the visualization explicitly in code, making the process transparent, reproducible, and auditable, unlike Excel, where chart creation involves manual steps.

Reproducibility

Once a ggplot script is created, it can be reused with new data effortlessly, while Excel requires redoing many manual steps every time data changes.
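A minimal sketch of what that reuse can look like. The helper `plot_response_counts()` and the two small data frames are hypothetical, made up for illustration, and are not part of the demo data.

```r
library(ggplot2)

# Hypothetical helper: draws a bar chart of response counts for any
# data frame that has a Response column.
plot_response_counts <- function(df, title) {
  ggplot(df, aes(x = Response)) +
    geom_bar() +
    labs(title = title, x = "Response", y = "Count")
}

# Made-up data standing in for two batches of survey results
fall_data   <- data.frame(Response = c("Agree", "Agree", "Neutral", "Disagree"))
spring_data <- data.frame(Response = c("Agree", "Neutral", "Neutral", "Neutral"))

# The same code produces a chart for each new batch of data
fall_plot   <- plot_response_counts(fall_data, "Fall responses")
spring_plot <- plot_response_counts(spring_data, "Spring responses")
```

When new data arrives, only the data frame passed to the function changes; the chart definition stays untouched.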

Automation

ggplot integrates with R, allowing automated data manipulation, visualization, and report generation (e.g., within scripts or Quarto documents). Excel relies on more manual input for generating charts, which is time-consuming and prone to errors.

Aesthetic Control

ggplot offers detailed aesthetic control over themes, colors, and styling, ensuring professional-quality visualizations. Excel’s design options, while functional, are more limited and harder to fine-tune.

Faceting and Layering

ggplot excels at creating faceted charts (multiple plots based on subsets of the data) and layering multiple data visualizations in one plot, something Excel cannot do easily.
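As a rough illustration of both ideas, the sketch below layers a jitter plot on top of a box plot and splits the result into one panel per subset. It uses R's built-in `mtcars` data as a stand-in, since the demo data is not reproduced here.

```r
library(ggplot2)

box_plot <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  # Layer 1: box plots (outliers hidden because the points are drawn below)
  geom_boxplot(outlier.shape = NA) +
  # Layer 2: the individual observations, jittered sideways to limit overlap
  geom_jitter(width = 0.15, height = 0, alpha = 0.6) +
  # One panel per transmission type (the am column)
  facet_wrap(vars(am)) +
  labs(x = "Cylinders", y = "Miles per gallon")
```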

Scalability

ggplot handles larger datasets more efficiently, whereas Excel can slow down or crash with large amounts of data or complex charts.

Integration with Data Workflow

ggplot integrates seamlessly into the broader data workflow in R (ETL, analysis, reporting), eliminating the need for separate tools or manual data exports to Excel for charting.

Advanced Customization

ggplot supports advanced customizations like custom labels, annotations, and interactions between chart components, offering far more precision than Excel.

Non-Linear Relationships and Statistical Graphics

ggplot can easily handle and visualize non-linear relationships, model fits, and statistical summaries (e.g., regression lines, confidence intervals), which is far more cumbersome in Excel.
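A small sketch of these statistical layers, again using the built-in `mtcars` data as a stand-in rather than the demo data. The choice of a quadratic fit for the non-linear case is only an example.

```r
library(ggplot2)

fit_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # Straight-line fit with a 95% confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE, level = 0.95) +
  # Quadratic fit for a possible non-linear relationship, without a band
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = FALSE,
              linetype = "dashed") +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```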

2 Demo #1: Quick graphs for a Survey

2.1 Survey

  • Using fake survey data that I created
  • The process (all handled with a Quarto script)
    • Import the data
    • Transform the data
    • Create some graphs

2.2 Transform the data

  • Pre-written script: manipulate-survey.qmd
  • Import the resulting CSV file
  • Size of the imported table
survey |> dim()
[1] 264071     16
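The import step might look roughly like the sketch below. The real file name is not given, so the code writes a tiny made-up CSV to a temporary file and reads it back. The column names `Question`, `Response`, and `NumResp` come from the calculations later in this demo; the values and the name `survey_demo` are invented.

```r
library(readr)

# Stand-in for the CSV produced by the transformation script
csv_path <- tempfile(fileext = ".csv")
write_csv(
  data.frame(
    Question = c("Q1", "Q1", "Q2"),
    Response = c("Agree", "Neutral", "Agree"),
    NumResp  = c(4, 3, 4)
  ),
  csv_path
)

# Import the file and check its size (rows, then columns)
survey_demo <- read_csv(csv_path, show_col_types = FALSE)
survey_demo |> dim()
```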

2.3 Bar graph

2.4 Do two calculations

  • Calculate (and save to surveyQRN) the number of each Response to each Question
# One row per Question/Response pair; .groups = "drop" returns an ungrouped table
surveyQRN <-
  survey |> 
    group_by(Question, Response) |> 
    summarize(Count = n(), .groups = "drop")
  • Calculate (and save to surveyQAvg) the average response to each Question
# One row per Question with the mean of its numeric responses
surveyQAvg <-
  survey |> 
    group_by(Question) |> 
    summarize(Avg = mean(NumResp))
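To make the shape of the two results concrete, here is a toy version of the same calculations. The five rows of data, the numeric coding of the responses, and the names `survey_toy`, `toy_counts`, and `toy_avgs` are all invented for illustration.

```r
library(dplyr)

# Made-up survey data: one row per answer
survey_toy <- tibble(
  Question = c("Q1", "Q1", "Q1", "Q2", "Q2"),
  Response = c("Agree", "Agree", "Neutral", "Agree", "Disagree"),
  NumResp  = c(4, 4, 3, 4, 2)
)

# Number of each Response to each Question
toy_counts <- survey_toy |>
  group_by(Question, Response) |>
  summarize(Count = n(), .groups = "drop")

# Average numeric response to each Question
toy_avgs <- survey_toy |>
  group_by(Question) |>
  summarize(Avg = mean(NumResp))
```

Here `toy_counts` has one row for each Question/Response combination that occurs, and `toy_avgs` has one row per Question.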

2.5 Faceted bar

2.6 The graph-building process

  • Gather data: get the data in the right form
  • Build the easel: define how variables will be represented
  • Paint: define the kind of graph
  • Construct the frame: define coordinates & axes
  • Refine: the overall look (colors, fonts, etc.)

2.7 Defining a graph

# The structure of an R/tidyverse ggplot specification
dataframename |>                          # 1. Gather data
  ggplot(aes(X)) +                        # 2. Build the easel
    facet_Z(column-info) +
    geom_Y(optional-stuff) +              # 3. Paint
    labs(...) +                           # 4. Construct the frame
    scale_x_continuous/discrete(...) +
    scale_y_continuous/discrete(...) +
    theme_A() +                           # 5. Refine
    scale_fill/color_B(specification)
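One way the five steps could be filled in with concrete functions is sketched below. The data frame `toy_counts`, its values, and the specific choices (`facet_wrap`, `geom_col`, `theme_minimal`, the "Set2" palette) are illustrative assumptions, not the settings used for the graphs in this demo.

```r
library(ggplot2)

# Made-up counts of each Response to each Question
toy_counts <- data.frame(
  Question = c("Q1", "Q1", "Q1", "Q2", "Q2", "Q2"),
  Response = c("Agree", "Neutral", "Disagree", "Agree", "Neutral", "Disagree"),
  Count    = c(12, 5, 3, 8, 9, 3)
)

toy_plot <- toy_counts |>                          # 1. Gather data
  ggplot(aes(x = Response, y = Count,              # 2. Build the easel
             fill = Response)) +
    facet_wrap(vars(Question)) +
    geom_col() +                                   # 3. Paint
    labs(title = "Responses by question",          # 4. Construct the frame
         x = NULL, y = "Number of responses") +
    scale_y_continuous(limits = c(0, 15)) +
    theme_minimal() +                              # 5. Refine
    scale_fill_brewer(palette = "Set2")
```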

2.8 Stacked bar

3 Demo #2: Quick graphs for Student Information

3.1 Box plot

3.2 Faceted bar graph

3.3 Box plot with Jitter plot

3.4 Scatter plot with Regression

4 Demo #3: Beautiful graphs

4.1 Detailed distribution of grades

4.2 Hex plot

4.3 Stacked bar

4.4 Labelled bar graph

4.5 Exporting a graph

ggsave("avgresp.png")
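Called with only a file name, `ggsave()` writes the most recently displayed plot. A slightly more explicit sketch is shown below; the plot object `avg_plot`, its made-up data, the output location, and the size settings are all assumptions chosen for illustration.

```r
library(ggplot2)

# Made-up averages standing in for the real results
avg_plot <- ggplot(
  data.frame(Question = c("Q1", "Q2"), Avg = c(3.7, 3.0)),
  aes(x = Question, y = Avg)
) +
  geom_col()

# Name the plot explicitly and fix the size so the export is repeatable
out_file <- file.path(tempdir(), "avgresp.png")
ggsave(out_file, plot = avg_plot, width = 6, height = 4, units = "in", dpi = 300)
```

Passing `plot =` and explicit dimensions means the saved file does not depend on which plot was shown last or on the size of the plotting window.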

4.6 Creating a formatted report


5 Summary

5.1 Benefits of this process

  • Efficiency: Once the process is set up, you can rerun it anytime with updated data, drastically reducing the time spent on routine reporting.
  • Fewer Errors: Automating data cleaning and transformations produces consistent results and reduces the risk of human error that comes with hand-edited Excel formulas.
  • Reproducibility & Transparency: Every step is documented, so it can be reviewed, audited, and easily modified over time.
  • Future-Proofing: R scripts can evolve with your needs. For example, when new data sources are added or new questions need to be asked, you can adapt the process without starting over.

5.2 Call To Action

  • Start Small: Try R with one report or dataset. Use it to demonstrate time savings and improvements in quality.
  • Resources Are Available: R, ggplot, and Quarto are open-source and free, so they are essentially risk-free to try.
  • Show Success: When piloting, track time saved and improvements in workflow. Share these results with leadership to justify broader adoption.
  • Support for Adoption: Communities of Practice exist (ThIRsdays); resources exist (rforir.com); courses exist (see me).

5.3 Closing thought

The (free) tools are out there, waiting to make your work faster, more transparent, and more impactful. Take the first step, and soon you’ll wonder how you managed without them.