Overview
Learning Goals
Throughout the course, we have touched upon various phases of the data preparation and workflow management pipeline. In the team project, we bring together all that you have learnt.
Together with your team members, you will
- set up a reproducible workflow on GitHub,
- apply the end-to-end Git workflow for versioning files, and manage your project using GitHub Issues, and
- build and automate a data preparation and analysis pipeline from scratch.
Working on your team project is not only a great refresher on the course content; it also gives you a better sense of the added value of using Git/GitHub when collaborating with one another, and of automation with make when working with many source code files.
For inspiration, explore past team projects like Airbnb Price Calculator, COVID-19 and Length of Stay, and Movie Genre IMDb Analysis. Use these as a starting point, keeping in mind that grading criteria may have changed and these examples are not flawless.
This project may be different from other projects that you have worked on during your studies. In particular, the purpose of this project is not to write an academic research paper, but instead to focus on the infrastructure of working on an academic research paper. For more insights, check the grading guidelines.
Organization
Coaching sessions
During the course, you will have the opportunity to meet up with the course instructor for coaching sessions. These sessions are meant for you to receive feedback on your ideas and code. Frequently, this also entails problem-solving & debugging.
- Participation: All teams attend the full session. Teams typically collaborate on their projects while the course instructor provides support (in-person by walking around or via Zoom breakout rooms).
- Session Format:
- First Half: Each team gets 5–10 minutes to provide a progress update and seek assistance from the coach.
- Second Half: Time is allocated on a needs basis to address specific issues or questions raised by teams.
- Deliverables: Most coaching sessions will help the team work on some deliverables, which are always due before the next coaching session. Please refer to Canvas for exact due dates each week.
Team composition
- 4-5 students per team
- Allocation in the first course week
- Enroll your team in the template repository from GitHub Classroom
Deadline & submission
- The submitted repository on GitHub Classroom is the team project that will be graded.
- Deadline: tba
Permitted Level of AI Use
Level of AI allowed for this assignment: AI-assisted idea generation and structuring (Level 3 on AI Index Tilburg University)
- You are allowed to use generative AI tools to develop or refine initial ideas, materials, paraphrasing, structures, or outlines. This includes generating code, e.g., for R.
- Failing to declare AI use, or using AI beyond what is allowed in the syllabus, may be considered fraud and will be reported to the Examination Board.
- Mandatory for students: For this assignment, please keep a simple “logbook” documenting which AI tools you have used and for what purposes. Submit this logbook with the assignment (e.g., as a PDF). Examiners may inspect the logbook at any time.
Gotcha! There is no report! The project should document itself (e.g., comments in code, makefile), plus you’ll have an amazing README that ties everything together and motivates your project. Make this one shine! :)
Note that we will check out the state of the repository at the deadline date and time, so any changes you make to the repository afterwards are not considered for grading.