The course covers the fundamental concepts of management and design of database systems.
Topics include the relational data model; the SQL query language; implementation techniques of database management systems (index structures and query processing); and NoSQL databases.
The learning objectives are: a) to understand and use the main technologies for database management; b) to design relational and non-relational databases from a conceptual, logical and physical perspective; c) to use SQL to write efficient queries over large datasets; and d) to create and query large-scale datasets.
This module covers the fundamental concepts of management and design of a business intelligence (BI) system. Topics include data models for building a data warehouse; ETL (extract, transform and load) functionalities; OLAP analysis; basic data mining; reporting and interactive dashboards; and the evolution of BI architectures for large datasets. The module also covers techniques and algorithms for data visualization and exploratory analysis, based on principles from graphic design, perceptual psychology and cognitive science. It is aimed at students who intend to use visualization in their data analytics work. The learning objectives are as follows:
Knowledge and understanding
Applying knowledge and understanding
Lectures, hands-on exercises, paper reading, student presentations and seminars.
Should teaching be carried out in mixed mode or remotely, the arrangements described above may be changed as necessary, in keeping with the programme outlined in the syllabus.
The main teaching methods are as follows:
Should teaching be carried out in mixed mode or remotely, the arrangements described above may be changed as necessary, in keeping with the programme outlined in the syllabus.
Basic programming skills
Strongly recommended. Attending and actively participating in the classroom activities will contribute positively towards the overall assessment of the oral exam.
1) Models and Languages for Database Management
Fundamentals of Database Management Systems (DBMS)
Relational Model: basic concepts, integrity constraints and keys.
SQL language: data definition, data modification, queries, views, transactions.
NoSQL databases: MongoDB
2) Querying and processing big data
Apache Spark SQL with Python
Datasets and DataFrames
Examples of data analysis with Spark SQL
1. Introduction to Business Intelligence and Big Data Analytics (6 hours)
2. Data models for data warehouse (10 hours)
3. BI Architecture (8 hours)
4. Data Visualization (16 hours)
Book 1: R. Elmasri and S. Navathe, "Fundamentals of Database Systems", 7th Edition, Pearson, 2016.
Book 2: B. Chambers and M. Zaharia, "Spark: The Definitive Guide", O'Reilly, 2018.
Instructor’s notes
Any further teaching materials will be published on the course's Studium page.
Instructor's notes will be made available on the Studium website and/or the Microsoft Teams platform.
DATABASE | | |
No. | Topics | Text references |
1 | Introduction to databases: Concepts and Architecture | Book 1 - Chapter 1 and 2 |
2 | Relational Data Model | Book 1 - Chapter 5 |
3 | Basic SQL: data definition, queries, and update statements | Book 1 - Chapter 6 + Notes |
4 | Advanced SQL: Complex Queries, Triggers, Views | Book 1 - Chapter 7 + Notes |
5 | Query processing and optimization | Book 1 - Chapter 18 and 19 |
6 | NOSQL Databases and Big Data Storage Systems | Book 1 - Chapter 24 + Notes |
7 | Active, Temporal, Spatial, Multimedia, and Deductive Databases | Book 1 - Chapter 26 |
8 | Getting started with Spark SQL for Data Processing | Book 2 - Chapter 1 and 2 + Notes |
9 | Spark SQL for Data Exploration | Book 2 - Chapter 3 + Notes |
10 | Spark SQL for Learning Applications | Book 2 - Chapter 6 and 10 + Notes |
11 | Multimedia benchmarks for bias identification and analysis | Research paper list on course web site |
BIG DATA ANALYTICS | | |
No. | Topics | Text references |
1 | Introduction to Big Data Analytics. | [Notes] |
2 | Business intelligence: introduction, fundamental concepts and architectures | [Notes] [GoRi] Chap. 1 |
3 | The structure and evolution of BI and Big Data analytics systems | [Notes] |
4 | Data models for data warehouse: conceptual modeling and design | [GoRi] Chap. 2-6 |
5 | Multi-dimensional data model | [GoRi] Chap. 5 |
6 | Data models for data warehouse: logical modeling and design | [GoRi] Chap. 8-9 |
7 | ETL (extract, transform and load) process | [GoRi] Chap. 10 [Notes] |
8 | OLAP analysis and queries | [GoRi] Chap. 7 [Notes] |
9 | Introduction to Data Visualization. Visual Perception and Preattentive Attributes | [Dash] Chap. 1 [Few2] Chap. 4 |
10 | Charts and standard views: relevance, appropriateness and best practices | [Few1] |
11 | Use of colors in data visualization | [Dash] Chap. 1 |
12 | Advanced and innovative tools for data visualization: the Tableau platform | [Notes] |
13 | Dashboard design principles. Exploratory vs. Explanatory dashboards. | [Few2] |
14 | Data visualization: infographics and storytelling | [Few2] |
Written exam with SQL and NoSQL exercises.
Learning assessment may also be carried out online, should the conditions require it.
The final exam consists of
Assessment criteria include: depth of analysis, adequacy, quality and correctness of the proposed solutions to the project work, ability to justify and critically evaluate the adopted solutions, clarity.
The grade for the Big Data Analytics module accounts for 50% of the total grade for the entire course.
Examples of questions and exercises are available on the Studium platform and/or the Microsoft Teams platform.