Short Course in Data Stream Mining 

Slides from a short course on Data Stream Mining, covering classification methods, adaptive change detection, clustering, and frequent pattern mining.

Keynote Talk at Business Applications of Social Network Analysis (BASNA) 2014 

I was happy to be invited to give a keynote talk at BASNA 2014, the 5th International Workshop on Business Applications of Social Network Analysis, which was co-located with the 2014 IEEE International Conference on Data Mining (ICDM 2014) in Shenzhen, China. The talk was on Real-Time Big Data Stream Analytics, about new techniques in […]

Big Data Stream Mining Tutorial at IEEE Big Data 2014 

Gianmarco de Francisci Morales presented our tutorial “Big Data Stream Mining” this week at IEEE Big Data 2014 in Washington DC. This tutorial was a gentle introduction to mining big data streams. The first part introduced data stream learners for classification, regression, clustering, and frequent pattern mining. The second part discussed data stream mining on […]

Extreme Classification: Classify Wikipedia documents into one of 325,056 categories 

Extreme classification, where one needs to deal with multi-class and multi-label problems involving a very large number of categories, has opened up a new research frontier in machine learning. Many challenging applications, such as photo and video annotation and web page categorization, can benefit from being formulated as supervised learning tasks with millions, or even […]

Evolving Data Stream Classification and the Illusion of Progress 

Data is being generated in real time in increasing quantities, and the distribution generating this data may be changing and evolving. In a paper presented at ECML-PKDD 2013 titled “Pitfalls in benchmarking data stream classification and how to avoid them”, we show that classifying data streams has an important temporal component, which we are currently not […]

Mining Big Data in Real Time 

Streaming data analysis in real time is becoming the fastest and most efficient way to obtain useful knowledge from what is happening now, allowing organizations to react quickly when problems appear or to detect new trends that help improve their performance. Evolving data streams are contributing to the growth of data created over the last […]