Invited Talk at Asian Conference on Machine Learning (ACML) 2016 

I was very happy to give an invited talk at the 8th Asian Conference on Machine Learning (ACML 2016) in Hamilton, New Zealand. The talk was on Massive Online Analytics for the Internet of Things (IoT). The challenge of deriving insights from the Internet of Things (IoT) has been recognized as one […]

IoT Big Data Stream Mining Tutorial at KDD 2016 

This week we presented our tutorial “IoT Big Data Stream Mining” at KDD 2016 in San Francisco. The tutorial was a gentle introduction to mining IoT big data streams. The first part introduced data stream learners for classification, regression, clustering, and frequent pattern mining. The second part dealt with scalability issues inherent in IoT applications, […]

Short Course in Data Stream Mining 

Slides of a short course in Data Stream Mining, presenting classification methods, adaptive change detection, clustering and frequent pattern mining.

Keynote Talk at Business Applications of Social Network Analysis (BASNA) 2014 

I was happy to be invited to give a keynote talk at BASNA 2014, the 5th International Workshop on Business Applications of Social Network Analysis, which was co-located with the 2014 IEEE International Conference on Data Mining (ICDM 2014) in Shenzhen, China. The talk was on Real-Time Big Data Stream Analytics, about new techniques in […]

Big Data Stream Mining Tutorial at IEEE Big Data 2014 

This week Gianmarco de Francisci Morales presented our tutorial “Big Data Stream Mining” at IEEE Big Data 2014 in Washington DC. The tutorial was a gentle introduction to mining big data streams. The first part introduced data stream learners for classification, regression, clustering, and frequent pattern mining. The second part discussed data stream mining on […]

Extreme Classification: Classify Wikipedia documents into one of 325,056 categories 

Extreme classification, where one needs to deal with multi-class and multi-label problems involving a very large number of categories, has opened up a new research frontier in machine learning. Many challenging applications, such as photo and video annotation and web page categorization, can benefit from being formulated as supervised learning tasks with millions, or even […]

Evolving Data Stream Classification and the Illusion of Progress 

Data is being generated in real time in increasing quantities, and the distribution generating this data may be changing and evolving. In a paper presented at ECML-PKDD 2013 titled “Pitfalls in benchmarking data stream classification and how to avoid them”, we show that classifying data streams has an important temporal component, which we are currently not […]

Resources to learn Big Data Analytics 

A list of books and resources that are available online for learning Data Science. Industry: McKinsey, “Big data: The next frontier for innovation, competition, and productivity”; O’Reilly, “Big Data Now: 2012 Edition”; IBM, “Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data”; Pentaho, “Real-Time Big Data Analytics: Emerging Architecture”. […]

Mining Big Data in Real Time 

Streaming data analysis in real time is becoming the fastest and most efficient way to obtain useful knowledge from what is happening now, allowing organizations to react quickly when problems appear or to detect new trends that help improve their performance. Evolving data streams are contributing to the growth of data created over the last […]

Big Data Mining SIGKDD Explorations 

For the December 2012 Big Data Mining issue of SIGKDD Explorations, we selected four contributions that together show very significant state-of-the-art research in Big Data Mining, and that provide a broad overview of the field and a forecast of its future. Scaling Big Data Mining Infrastructure: The Twitter Experience by Jimmy Lin and Dmitriy Ryaboy (Twitter, Inc.). […]