Designing information systems for data modeling & deployment challenges in Data Science.

Instructors: Prof. Nitin Sawhney, Dr. Barbara Keller, Sami El-Mahgary

Teaching Assistants: Etna Lindy, Long Nguyen, Trang Nguyen, Pham Binh, Sophie Truong, Ville Vuorenmaa

Lectures were hosted online: Tuesdays 16:15-18:00 (20.04.2021 – 07.05.2021)

We designed and co-taught an introductory but highly intensive course on relational databases for Data Science BSc students at Aalto University. It was first offered in Spring 2021 and will be offered each year; no prior knowledge of databases is assumed. The course covers the fundamentals of relational algebra, relational schema design using the Unified Modeling Language (UML), functional dependencies and normal forms, the concept of transactions, creating SQL tables (including indexes), and using SQL to query the database.

After completing the course, students have the know-how to design and implement relational databases that satisfy the normalization rules. They should also be able to write and run various types of SQL queries to extract the desired data from a database, an essential step in data analysis. In particular, the course draws on relevant examples to prepare students to apply the principles of relational databases to data science projects.
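As a rough illustration of the skills listed above (defining tables with integrity constraints and an index, running a transaction, and querying with a join and an aggregate), here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows (clinic, shipment, doses) are hypothetical and are not taken from the course materials or from any student project.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign keys off by default
    conn.executescript(
        """
        CREATE TABLE clinic (
            clinic_id INTEGER PRIMARY KEY,
            name      TEXT NOT NULL
        );
        CREATE TABLE shipment (
            shipment_id INTEGER PRIMARY KEY,
            clinic_id   INTEGER NOT NULL REFERENCES clinic(clinic_id),
            doses       INTEGER NOT NULL CHECK (doses > 0)
        );
        -- Index on the join column used by the query below.
        CREATE INDEX idx_shipment_clinic ON shipment(clinic_id);
        """
    )
    # "with conn" wraps the inserts in one transaction:
    # commit if everything succeeds, roll back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT INTO clinic (clinic_id, name) VALUES (?, ?)",
            [(1, "North"), (2, "South")],
        )
        conn.executemany(
            "INSERT INTO shipment (clinic_id, doses) VALUES (?, ?)",
            [(1, 100), (1, 50), (2, 70)],
        )
    return conn


def doses_per_clinic(conn):
    """Total doses shipped to each clinic, ordered by clinic name."""
    query = """
        SELECT c.name, SUM(s.doses)
        FROM clinic AS c
        JOIN shipment AS s ON s.clinic_id = c.clinic_id
        GROUP BY c.clinic_id, c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    print(doses_per_clinic(build_demo_db()))
```

Keeping clinics and shipments in separate tables linked by a key, rather than repeating the clinic name in every shipment row, is the kind of design decision that normalization addresses.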

Students worked in teams to conduct data modeling and deployment of databases for real-world projects including a dairy farm and vaccine distribution.

Modeling the distribution of vaccines: UML model devised by a project team (Hanne Sauer, Sergey Zakuraev, Atreya Ray, Aayush Kucheria, Anselmi Jokinen), June 2021.

Weekly Schedule

Week 1: Introduction to Databases, Relational Algebra and SQL

Week 2: Unified Modeling Language (UML) Part I; Functional Dependencies & Normal Forms Part I; Defining SQL Tables, Integrity Constraints and Views

Week 3: Unified Modeling Language (UML) Part II; Functional Dependencies & Normal Forms Part II; Advanced SQL and Aggregation

Week 4: Views, Indexes, Transactions and Triggers

Week 5: Data Cleaning and Data Analysis

Week 6: Project Presentations & Discussion

Week 7: Additional Exercise Sessions

Week 8: Project Package Deliverables

All sessions were hosted online (along with weekly lab tutorials led by TAs), and video recordings were posted online.

Course website: https://mycourses.aalto.fi/course/view.php?id=28155