This module will present concepts, architectures, and algorithms for processing and analyzing IoT big data at very large scale in distributed settings. The following topics will be covered:

  • Apache Hadoop
  • Apache Spark
  • Apache Flink

The module will present algorithms for data analysis and mining, with a focus on mining massive datasets in real time. It will cover both practical and theoretical aspects of data mining. During the module, students will become familiar with the most successful algorithms for classification, clustering, and frequent itemset mining.

Lecture Slides

    • 1. Introduction to Big Data Slides
    • 2. Big Data Science Slides
    • 3. Real Time Big Data Management Slides
    • 4. Internet of Things Data Science Slides