Sniffing Out the Strange — A Look at Anomaly Detection | Øredev 2018

🎤 Sniffing Out the Strange — A Look at Anomaly Detection

👤 Ashic Mahtab

Clustering is a form of unsupervised machine learning that attempts to group together similar things. It has many applications, one of which is finding anomalies in unlabelled data. One of the practical applications of this is identifying fraudulent behaviour. The most common algorithm for clustering is K-means. However, K-means only works with regularly shaped clusters, computation can be expensive for large datasets, and it is quite sensitive to outliers. Other algorithms for clustering exist, such as the BFR algorithm and the CURE algorithm that can tackle some of these problems. In this session, we will start off by looking at K-means clustering, and then we will see how BFR and CURE can solve some of its problems. We will also discuss how these techniques helped in implementing a streaming fraud detection solution for an FX client. We will see how we went from simple query based processing to complex models that led to better insight, which in turn led to a simpler system.

This page was generated from this YAML file. Found a typo, want to add some data? Just edit it on GitHub.