1 Introduction
Data are everywhere and are becoming more and more important. Here I describe a solid way to access, analyze, visualize and use official statistics from the database of Eurostat, the official data provider of the European Commission.
1.1 Eurostat database
Eurostat is here: https://ec.europa.eu/eurostat
Basically, this website provides two things:
- Access to database tables and datasets
- Reports from Eurostat’s staff about the data
Data and data reports are two different things. Although reading the reports is important, the main goal of this book is to provide ways to “read the data” in “our way”. There is nothing tricky or unusual about this; it simply means reading the data and performing tasks such as:
- Search
- Filter
- Link and/or join
- Visualize the data in a meaningful way towards an analytical target.
Why do we need to do this? Some reasons:
- Reports are limited in scope. For example, Eurostat’s staff never comment on the data (neutrality principle).
- Research: we need a specific dataset to work on a scientific question or program, or just out of curiosity.
- Media communication and the fight against fake news. Unfortunately, we live in a world with a great deal of misinformation, and official statistics can clear up much of it if properly presented.
1.3 R and tidyverse
R is a statistical language and an environment for statistical computing. This is not a book about R per se, although it contains a lot of R-related material. Specifically, it makes heavy use of the tidyverse, a collection of packages that offers a modern way to work with data.
R can be downloaded from the R-project website. For writing R programs and scripts efficiently, the RStudio IDE is highly recommended.
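To give a first taste of the tidyverse style, here is a minimal sketch of two of the tasks listed in section 1.1, filtering and joining, using the dplyr package. It is only an illustration: the two small tables, their column names (`geo`, `year`, `values`, `name`) and the numbers in them are placeholders made up for this example, not real Eurostat data, and the sketch assumes that dplyr is already installed.

```r
# Toy example: all values below are placeholders, NOT real Eurostat figures.
library(dplyr)

# Hypothetical observations table: one row per country code and year.
observations <- data.frame(
  geo    = c("AT", "AT", "BE", "BE"),
  year   = c(2020, 2021, 2020, 2021),
  values = c(10, 11, 20, 21)
)

# Hypothetical lookup table mapping country codes to country names.
countries <- data.frame(
  geo  = c("AT", "BE"),
  name = c("Austria", "Belgium")
)

result <- observations %>%
  filter(year == 2021) %>%          # filter: keep the rows for a single year
  left_join(countries, by = "geo")  # join: attach the readable country names

print(result)
```

The pipe operator `%>%` passes the result of one step to the next, so the code reads in the same order as the tasks themselves: start from the observations, filter them, then join the names.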