This module will present concepts, architectures, and algorithms for IoT big data processing and analytics at very large scale in distributed settings. The following topics will be covered:
- Apache Hadoop
- Apache Spark
- Apache Flink
The module will also present algorithms for data analysis and mining, with a focus on mining massive datasets in real time. It will cover both practical and theoretical aspects of data mining. During the module, students will become familiar with the most successful algorithms for classification, clustering, and frequent itemset mining.