Stanford Data Science / Infoseminar
Winter 2015

Anomaly Detection with Applications in Security and Sensor Networks

Thomas G. Dietterich, Oregon State University


Our team at Oregon State has developed several new algorithms for anomaly detection. These are based on two main principles: "anomaly detection by underfitting" and "anomaly detection by overfitting". In the underfitting approach, a model is fit to the data and points that do not fit well (e.g., that have low estimated probability density) are flagged as anomalies. In the overfitting approach, we transform the data to create learning problems in which there should be no signal or pattern and then apply machine learning algorithms to fit this data. If the algorithm finds a pattern, this is due to overfitting, and points belonging to the (false) pattern are likely to be anomalies. This talk will present these algorithms and also describe our benchmarking framework, which allows us to measure and compare the performance of different anomaly detection algorithms. I will also describe two applications. The first is a security experiment conducted under the DARPA ADAMS program. The second is an application to data cleaning in sensor networks.
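As a rough illustration of the underfitting principle described above, the following minimal sketch fits a deliberately simple model (a single multivariate Gaussian) to the data and flags the points with the lowest estimated density. The single-Gaussian model, the function names, the `fraction` parameter, and the synthetic data are all assumptions chosen for illustration; they are not the algorithms presented in the talk.

```python
import numpy as np

def gaussian_log_density(X, mean, cov):
    """Log density of each row of X under a multivariate Gaussian."""
    d = X.shape[1]
    diff = X - mean
    # Squared Mahalanobis distance of each point from the mean.
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2 * np.pi))

def flag_low_density(X, fraction=0.01):
    """Fit one Gaussian to X and flag the lowest-density fraction of points.

    Both the model and the quantile threshold are illustrative assumptions.
    """
    mean = X.mean(axis=0)
    # Small ridge term keeps the covariance matrix invertible.
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    scores = gaussian_log_density(X, mean, cov)
    threshold = np.quantile(scores, fraction)
    return scores, scores <= threshold

# Synthetic data (hypothetical): 500 nominal points plus 2 planted anomalies.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(500, 2))
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])
X = np.vstack([inliers, outliers])

scores, is_anomaly = flag_low_density(X, fraction=0.01)
print("Lowest-density points:", np.argsort(scores)[:2])
```

Because the model is too simple to accommodate the two planted points, they receive very low estimated density and fall below the threshold. A richer density estimator could be substituted without changing the overall pattern of fitting, scoring, and thresholding.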


Thomas G. Dietterich (A.B. Oberlin College 1977; M.S. University of Illinois 1979; Ph.D. Stanford University 1984) is one of the founders of the field of Machine Learning. Among his research contributions are the application of error-correcting output coding to multiclass classification, the formalization of the multiple-instance problem, the MAXQ framework for hierarchical reinforcement learning, and the development of methods for integrating non-parametric regression trees into probabilistic graphical models (including conditional random fields and latent variable models). Among his writings are Chapter XIV (Learning and Inductive Inference) of the Handbook of Artificial Intelligence, the book Readings in Machine Learning (co-edited with Jude Shavlik), and his frequently cited review articles "Machine Learning Research: Four Current Directions" and "Ensemble Methods in Machine Learning".

He served as Executive Editor of Machine Learning (1992-98) and co-founded the Journal of Machine Learning Research. He is currently the editor of the MIT Press series on Adaptive Computation and Machine Learning. He also served as co-editor of the Morgan-Claypool Synthesis Series on Artificial Intelligence and Machine Learning. He has organized several conferences and workshops, including serving as Technical Program Co-Chair of the National Conference on Artificial Intelligence (AAAI-90), Technical Program Chair of the Neural Information Processing Systems conference (NIPS-2000), and General Chair of NIPS-2001. He is a Fellow of the ACM, AAAI, and AAAS. He served as founding President of the International Machine Learning Society, and he is currently a member of the Steering Committee of the Asian Conference on Machine Learning.