SISMID 2026 · Pre-session materials 2
Data Science in Infectious Disease Modeling using R
Please complete all materials on this page before your first synchronous session. They should be reviewed before your live lab sessions on June 22 and June 23, so work through the videos, readings, and problem sets at your own pace.
Pre-Course Work 2: Advanced Data Wrangling
Welcome to Advanced Data Wrangling!
Please take the pre-assessment poll before watching each video. Watch the videos in order, from Module 2.1 through Module 2.10.
Post-assessment and coding activity for each module: DSR_adv_data_wrangling_post_assessment_coding_exercises.pdf
Module 2.1 — Overview: Tidyverse, dplyr, and pipes
Module 2.2 — Tidyverse cheat sheets: a primer
Module 2.3 — Wide vs. long data
Module 2.4 — Reshaping data with pivots
Module 2.5 — Advanced data reshaping
Module 2.6 — Combining data with joins
Module 2.7 — Complex data joining
Module 2.8 — Strings
Module 2.9 — Categorical data
Module 2.10 — Regular expressions
Pre-Course Work 3: Special Considerations for Public Health Data
Lab 3.1 — Loading Data From an API
Post-Video Exercises
- Request your U.S. Census API key at api.census.gov/data/key_signup.html. Use an email address that you will be able to access during the synchronous sessions.
Additional API resources:
R packages:
Lab 3.2 — Working with PII Data
Additional PII resources:
R packages:
- encryptr: encrypt files or columns with an RSA key pair
- pii: detect PII
- sanityzeR: detect and redact PII
- anonymizer: detect and anonymize PII
Pre-Course Work 4: Advanced Methods
Lab 4.1 — Troubleshooting Functions Using Conditions
Lab 4.2 — Limits of R
Lab 4.3 — Parallel Programming
Lab 4.4 — Does It Function?
You’ve completed the asynchronous portion of the course!
We want your feedback on this course! Please share your thoughts.