Course Syllabus

Wharton Ph.D. Tech Camp 2020



Xiaoning (Gavin) Wang

Office: JMHH 533.4



Office Hours:

By appointment


Class Time:

MWF 10:30AM – 12:00PM from August 7 through August 26, 2020

Our first class will meet on Friday, August 7


Session Format:

Each session has two parts: theory and lab. The theory part covers the technical concepts. The lab part walks through the code and its results in RStudio. Sessions will be recorded so that students can review the material.


Course Description:

The aim of this course is to familiarize incoming and current Wharton Ph.D. students with the basic technical skills and tools required for empirical research. In my experience, most graduate-level classes at Penn are taught in R, so this tech camp also uses R. Topics include the fundamentals of R and RStudio, data cleaning and visualization, causal inference, natural language processing, and machine learning. By the end of this short course, students will have a better understanding of which tools are most appropriate for different data analysis tasks and how those tools fit into their research pipeline.



There are no prerequisites for this course. Students with no programming experience will likely benefit the most, but students with prior experience will also learn how their programming skills fit into a larger empirical research pipeline. Feel free to attend sessions selectively. Auditors are welcome. Each session will be roughly a 60-minute lecture followed by a 30-minute lab, during which you are encouraged to work on exercises. There is no exam.


Course Topics:

  1. Introduction and R basics
  2. Data Wrangling (dplyr, data.table)
  3. Data Visualization (ggplot2, wordcloud, maps)
  4. Regression and Causal Inference (Simple Regression, Instrumental Variables, Panel Data Methods, Matching, GMM)
  5. Text Mining and Natural Language Processing (Regular Expressions, Bag of Words)
  6. Machine Learning (Classification, Clustering)
  7. Wharton HPCC and Behavioral Lab