# TEORIE, MODELLI E TECNICHE INFORMATICHE E DI ANALISI DEI DATI

6 CFU - 2nd Semester

### Teaching Staff

CESARE GAROFALO - Module Monovariate and multivariate analysis - SPS/07 - 3 CFU
GIOVANNI GIUFFRIDA - Module INTRODUZIONE AL DATA MINING - INF/01 - 3 CFU

## Learning Objectives

• Monovariate and multivariate analysis
This module aims to give students (1) a working knowledge of the R language, along with the ability to keep learning its features independently from resources available online, and (2) an understanding of selected topics in univariate and multivariate statistics, developed through applications in R.
• INTRODUZIONE AL DATA MINING
This module gives an overview of techniques for data management and data discovery (data and text mining) aimed at automatic pattern discovery. These techniques have been widely adopted in recent years because of the huge quantities of data collected as we go about our digital lives (reading online news, using social media, shopping online, and so on). Sophisticated data analysis techniques make it possible to discover patterns that are meaningful to social scientists.

## Detailed Course Content

• Monovariate and multivariate analysis

The course includes a discussion of:

-vectors, matrices, factors, lists, tables, and data frames, and the operations on these objects in R;

-reading from and writing to external files in R;

-graphical representation of data in R;

-programming with R: defining new functions and using control constructs (conditionals and loops);

-univariate and bivariate statistics with R;

-correlation and linear regression with R;

-principal component analysis with R;

-cluster analysis with R.

• INTRODUZIONE AL DATA MINING
Data and information. Overview of database management systems. The relational model. Introduction to big data. Basic notions of algorithms for data discovery: classification trees, clustering, rule discovery. Basic notions of text mining and sentiment analysis.

## Textbook Information

• Monovariate and multivariate analysis

Adopted texts:

1. Franco Crivellari - Analisi statistica dei dati con R. Apogeo.
2. Michael J. Crawley - The R Book, 2nd Edition. Wiley.

Recommended readings:

1. Everitt B., Hothorn T. - An Introduction to Applied Multivariate Analysis with R. Springer, 2011.
2. Zhao Y., Cen Y. - Data Mining Applications with R. Academic Press, 2013.
3. Espa G., Micciolo R. - Problemi ed esperimenti di statistica con R. Apogeo, 2008.
4. Iacus S., Masarotto G. - Laboratorio di statistica con R. McGraw Hill Companies, 2007.
5. Paganoni A., Ieva F., Vitelli V. - Laboratorio di statistica con R. Eserciziario. Pearson, 2012.
6. Matloff N. - The Art of R Programming. No Starch Press, 2011.
7. Torgo L. - Data Mining with R. Learning with Case Studies. Chapman & Hall/CRC, 2011.
