Course in collaboration with Sopra Steria
This course aims to introduce the use of Big Data systems for processing data at scale on top of distributed architectures (e.g., cluster- and cloud-based architectures).
Using the Big Data life cycle as reference (i.e., data acquisition, storage, preparation, analysis, and visualization phases), the course introduces the fundamental concepts at the core of the existing Big Data stack and shows their practical application with concrete hands-on examples.
At the end of this course, students will be able to:
Define and illustrate with concrete examples the characteristics of Big Data (i.e., volume, velocity, and variety).
Understand and configure the main components of a Big Data platform for analytical operations.
Analyse large and heterogeneous datasets (structured and unstructured) in batch and streaming modes.
The course follows the principle of blended learning. Students are expected to read the provided online material and prepare before each lesson.
50% Exam
50% Datathon