Data Preparation and Workflow Management (dPrep)
Engineer data sets from complex raw data and manage research projects efficiently
Tilburg University, Block 3, 2020/2021 (February - April 2021)
Instructor: dr. Hannes Datta
Learn how to prepare data for empirical analysis
Welcome to the course website of dPrep. This course teaches you how to engineer data sets for statistical analysis. Many students and researchers perceive the process of “creating” a data set for analysis as rather simple: a bit of cleaning here, a bit of merging there, and you’re done. In this course, we take data preparation to the next level by considering highly complex data preparation workflows (think multiple sources, structured and unstructured data, data from databases and data from files, multiple delivery batches, lots of missing data, different file versions, etc.). Throughout the course, we’ll use the workflow principles of reproducible science that are documented at Tilburg Science Hub.
This website is the backbone of the course and features the following main sections:
The course section holds a list of weekly modules, consisting of live streams, readings, and prerecorded clips. Even if you’re not enrolled in this course, you can watch these clips, but interaction with the course instructor is reserved for enrolled students.
The tutorial and data challenge section offers self-guided tutorials that teach the principles of data preparation and workflow management. It also holds a (weekly) data challenge in which you can put your skills into practice!
Finally, this course uses building blocks and examples from Tilburg Science Hub, a platform that helps students, researchers and data scientists work together efficiently.
Enroll in this class
Head over to the course syllabus for all the details. The course offers seats to Research Master and PhD students from outside of Tilburg University.
Want to pursue a PhD at Tilburg University? Check out the Research Master in Business - Marketing track.