Course Syllabus


Sirui Wang



Office Hours:

Send me an email if you have questions, and we can set up a time!


Class Time:

MWF 10:30AM – 12:00PM ET from August 6 through August 25, 2021


Session Format:

Classes will be held over Zoom. The link to each of the sessions will be posted in the "Zoom" tab on the menu.

Each session will focus on either some topic related to empirical research or a technical tool that you may find helpful in your doctoral studies.

I will dedicate the first half of each class to introducing a topic or giving a demonstration, and the remainder of the time will be left for you to try it out yourselves with some relevant exercises.


Course Description:

The goal of this course is to familiarize incoming and current Wharton Ph.D. students with the basic technical skills and tools required for empirical research. We will start with some basics of the R programming language and then survey the methods you are most likely to encounter in empirical research. We focus on R since it is favored in most academic settings over other free computing tools, and you will most likely be using it in any statistics course you take at Wharton. The methods we will cover are not specific to R and can be implemented in other software (e.g., Stata, SAS, Python); depending on the context, some tools may be better suited than others. The topics we will cover include the basics of R programming, data wrangling and visualization, data scraping, regressions and causal inference, text analysis, and basic predictive machine learning. By the end of this short course, students will have a better understanding of which tools are most appropriate for the different data analysis settings they may encounter in their research.



There are no prerequisites for this course. Students with no programming experience will likely benefit the most, but students with prior programming experience can also learn about Wharton-specific resources that can help with their research. Feel free to attend the sessions selectively. There is no exam.


Course Topics:

  1. Introduction to R programming
  2. Data wrangling and visualization
  3. Data collection and scraping
  4. Regression and methods for causal inference
  5. Text analysis
  6. Predictive machine learning
  7. Wharton HPCC and Behavioral Lab