The course will focus on the study of univariate, bivariate and multivariate analysis using the R language, an open-source environment for data management, statistical analysis, graphing and, more generally, for the use of a variety of formal methods (Network Analysis, Time Series Analysis, Differential Equations, Machine Learning, Multivariate Statistics, etc.).
The course covers the following:
1) basic mathematical notions and the logical foundations preparatory to computer programming;
2) operations on vectors, matrices, factors, lists, tables, data frames, using the R language;
3) read and write operations on external files using the R language;
4) graphic representations of the data using the R language;
5) programming with R: definitions of new functions, control constructs, conditional constructs and iterative constructs (if, ifelse, for, while, break, repeat, next);
6) univariate and bivariate descriptive statistics using the R language;
7) linear correlation and regression using the R language;
8) principal component analysis using the R language;
9) cluster analysis using the R language;
10) network analysis using the R language.
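To give a flavour of several of the topics listed above, here is a minimal sketch in base R. It is only an illustration, not course material: the data frame `scores`, its values, and the helper `count_above` are hypothetical names invented for this example.

```r
# Hypothetical toy data; all names and values are made up for illustration.
scores <- data.frame(
  hours = c(2, 4, 6, 8, 10),
  grade = c(52, 60, 71, 79, 90),
  group = factor(c("a", "a", "b", "b", "b"))
)

# Topic 5: a user-defined function with an iterative and a conditional construct.
count_above <- function(x, threshold) {
  n <- 0
  for (value in x) {
    if (value > threshold) n <- n + 1
  }
  n
}
n_pass <- count_above(scores$grade, 60)

# Topic 6: a univariate descriptive statistic.
mean_grade <- mean(scores$grade)

# Topic 7: linear correlation and simple linear regression.
r <- cor(scores$hours, scores$grade)
fit <- lm(grade ~ hours, data = scores)
slope <- unname(coef(fit)["hours"])

# Topic 8: principal component analysis on the standardized numeric columns.
pca <- prcomp(scores[, c("hours", "grade")], scale. = TRUE)
```

Calling `summary(fit)` or `summary(pca)` afterwards prints the regression table and the proportion of variance explained by each component.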
We begin by discussing data and knowledge and the differences between them. We then move on to relational techniques for managing large amounts of data. We discuss in some detail database management systems and the transactions that guarantee data consistency. We also cover relational algebra, which is the foundation of information retrieval languages and, in particular, the basis of the SQL-based languages widely used in today’s database management systems.
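The core relational algebra operators mentioned above (selection, projection and join) can be sketched in base R, with the rough SQL equivalent noted in the comments. This is an illustrative sketch under stated assumptions: the tables `students` and `exams` and all their columns and values are hypothetical.

```r
# Hypothetical relations; table and column names are assumptions for illustration.
students <- data.frame(
  id = c(1, 2, 3),
  name = c("Ada", "Ben", "Cleo"),
  stringsAsFactors = FALSE
)
exams <- data.frame(
  student_id = c(1, 1, 3),
  course = c("R", "SQL", "R"),
  mark = c(28, 30, 25),
  stringsAsFactors = FALSE
)

# Selection: keep the rows that satisfy a predicate (SQL: WHERE mark >= 28).
high_marks <- exams[exams$mark >= 28, ]

# Projection: keep chosen columns and drop duplicate rows (SQL: SELECT DISTINCT course).
courses <- unique(exams["course"])

# Equi-join on a key (SQL: JOIN ... ON students.id = exams.student_id).
joined <- merge(students, exams, by.x = "id", by.y = "student_id")
```

By default `merge` performs an inner join, so a student with no exam rows does not appear in `joined`.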
Lecturer's assignments
- Optional: Introduction to Computational Social Science: Principles and Applications. Claudio Cioffi-Revilla (in English)
- Optional: Big data. Una rivoluzione che trasformerà il nostro modo di vivere e già minaccia la nostra libertà. Viktor Mayer-Schönberger, Kenneth N. Cukier and R. Merlini