A collection of anomaly detection algorithms and relevant datasets with a common interface that allows for fast and easy prototyping and benchmarking.
The simplest way to install the package is to use pip, either by cloning the repository:
$ git clone https://github.com/pbudzyns/ad_toolkit.git
$ cd ad_toolkit
$ pip install .
or directly from the repository URL:
$ pip install git+https://github.com/pbudzyns/ad_toolkit.git
The donut model requires extra (deprecated) dependencies and works only with older versions of Python that support tensorflow <= 1.15. To install the extra dependencies use
$ pip install .[donut]
or
$ pip install "ad_toolkit[donut] @ git+https://github.com/pbudzyns/ad_toolkit.git"
from ad_toolkit.datasets import NabDataset
nab = NabDataset()
nab.plot()
from ad_toolkit.detectors import AutoEncoder
x_train, _ = nab.get_train_samples()
model = AutoEncoder(window_size=100, layers=(64,32,16), latent_size=8)
model.train(x_train, epochs=20, learning_rate=1e-4)
x, y = nab.get_test_samples()
scores = model.predict(x)
import numpy as np
from ad_toolkit.evaluation import Result
labels = (scores > np.mean(scores)*2.2)
print(Result(labels, y))
nab.plot(anomalies={'ae': labels})
# Sample output:
# ... Result(accuracy=0.93,
# ... (tp, fp, tn, fn)=(135, 0, 3629, 268),
# ... precision=1.0,
# ... recall=0.33,
# ... f1=0.5,
# ... roc_auc=0.67,
# ... y_pred%=0.033482142857142856,
# ... y_label%=0.09995039682539683,
# ... )
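The numbers in the sample output are internally consistent; they can be reproduced from the confusion counts alone, assuming the standard metric definitions (the exact formulas used by `Result` are an assumption here):

```python
# Reproduce the metrics from the sample Result above from its
# confusion counts, using the standard definitions.
tp, fp, tn, fn = 135, 0, 3629, 268
total = tp + fp + tn + fn

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
y_pred_pct = (tp + fp) / total    # fraction of points flagged anomalous
y_label_pct = (tp + fn) / total   # fraction of points labelled anomalous

print(round(accuracy, 2), precision, round(recall, 2), round(f1, 2))
# 0.93 1.0 0.33 0.5
```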
A change-point detector that searches for points of significant change in behaviour.
References:
- Lee, W. H., Ortiz, J., Ko, B., & Lee, R. (2018). Time series segmentation through automatic feature learning.
- Boumghar, R., Venkataswaran, A., Brown, H., & Crespo, X. Behaviour-based anomaly detection in spacecraft using deep learning.
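The core idea behind window-based change-point detection can be sketched in plain numpy (this is a minimal illustration of the technique, not the ad_toolkit API): compute simple features over adjacent windows and score each point by the distance between them.

```python
import numpy as np

# Minimal change-point scoring sketch: compare features (mean, std)
# of the windows just before and just after each point; a large
# distance suggests a significant change in behaviour.
def change_point_scores(x, window=50):
    scores = np.zeros(len(x))
    for t in range(window, len(x) - window):
        left, right = x[t - window:t], x[t:t + window]
        feats_l = np.array([left.mean(), left.std()])
        feats_r = np.array([right.mean(), right.std()])
        scores[t] = np.linalg.norm(feats_l - feats_r)
    return scores

rng = np.random.default_rng(0)
# Synthetic series with a mean shift at t=500.
x = np.concatenate([rng.normal(0, 1, 500), rng.normal(3, 1, 500)])
scores = change_point_scores(x)
print(int(np.argmax(scores)))  # peaks near the true change at t=500
```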
A detector based on an auto-encoder model. It returns a prediction score based on the reconstruction error and is capable of working with multivariate time series.
References:
- An, J., & Cho, S. (2015). Variational autoencoder based anomaly detection using reconstruction probability.
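Reconstruction-error scoring, the idea this detector relies on, can be illustrated in plain numpy (this is a sketch of the technique, not the ad_toolkit API): points the model reconstructs poorly receive high anomaly scores.

```python
import numpy as np

# Anomaly score of each sample = its mean squared reconstruction error.
def reconstruction_error(x, x_hat):
    return np.mean((x - x_hat) ** 2, axis=1)

x = np.array([[0.0, 1.0, 0.0],       # normal sample
              [0.0, 5.0, 0.0]])      # anomalous sample
x_hat = np.array([[0.1, 0.9, 0.0],   # model reconstructs "normal"
                  [0.1, 0.9, 0.0]])  # ... even for the anomaly
scores = reconstruction_error(x, x_hat)
print(scores[1] > scores[0])  # True: larger error -> higher score
```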
A detector based on a variational auto-encoder model trained with the ELBO objective. During the prediction phase, the model generates n reconstructions of a data point and returns the resulting reconstruction probability.
References:
- An, J., & Cho, S. (2015). Variational autoencoder based anomaly detection using reconstruction probability.
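The reconstruction-probability idea from An & Cho (2015) can be sketched in plain numpy rather than an actual VAE (a minimal illustration, not the ad_toolkit API): draw n stochastic reconstructions of a point and average the point's likelihood under a Gaussian centred at each draw.

```python
import numpy as np

# Average likelihood of x under N(x_hat, sigma^2 I) over n draws;
# a low probability marks the point as anomalous.
def reconstruction_probability(x, reconstructions, sigma=1.0):
    lls = np.exp(-0.5 * np.sum((x - reconstructions) ** 2, axis=1)
                 / sigma ** 2)
    return lls.mean()

rng = np.random.default_rng(0)
x_normal = np.zeros(4)
x_anom = np.full(4, 5.0)
recon = rng.normal(0.0, 0.1, size=(10, 4))  # n=10 draws around "normal"
print(reconstruction_probability(x_normal, recon) >
      reconstruction_probability(x_anom, recon))  # True
```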
An LSTM anomaly detection model, trained on the task of predicting future values of the time series given previous values. The prediction score is the probability of the prediction error under the multivariate error distribution estimated on validation data.
References:
- Malhotra, P., Vig, L., Shroff, G., & Agarwal, P. (2015, April). Long short term memory networks for anomaly detection in time series.
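The error-distribution scoring step shared by the LSTM detectors can be sketched in plain numpy (an illustration of the technique, not the ad_toolkit API): fit a multivariate Gaussian to prediction errors on validation data, then score test-time errors by how unlikely they are under it (Mahalanobis distance).

```python
import numpy as np

# Fit a multivariate Gaussian to validation-set prediction errors.
def fit_error_distribution(errors):
    mu = errors.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(errors, rowvar=False))
    return mu, cov_inv

# Squared Mahalanobis distance: larger -> less likely -> more anomalous.
def anomaly_score(err, mu, cov_inv):
    d = err - mu
    return float(d @ cov_inv @ d)

rng = np.random.default_rng(0)
val_errors = rng.normal(0, 1, size=(500, 3))  # validation errors
mu, cov_inv = fit_error_distribution(val_errors)
print(anomaly_score(np.array([5.0, 5.0, 5.0]), mu, cov_inv) >
      anomaly_score(np.array([0.1, 0.0, -0.1]), mu, cov_inv))  # True
```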
An LSTM encoder-decoder model, trained on the task of reconstructing a window of values of the time series. The prediction score is the probability of the reconstruction error under the multivariate error distribution estimated on validation data.
References:
- Malhotra, P., Ramakrishnan, A., Anand, G., Vig, L., Agarwal, P., & Shroff, G. (2016). LSTM-based encoder-decoder for multi-sensor anomaly detection.
The donut model (requires the extra dependencies described in the installation section above).
References:
- Xu, H., Chen, W., Zhao, N., Li, Z., Bu, J., Li, Z., ... & Qiao, H. (2018, April). Unsupervised anomaly detection via variational auto-encoder for seasonal kpis in web applications.
A wrapper class for semi-supervised learning, capable of returning training sets with anomalous rows filtered out or limited to a requested percentage.
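What such a wrapper does can be sketched with a hypothetical helper (the function name and signature below are illustrative, not the ad_toolkit API): keep all normal rows and at most a requested fraction of anomalous ones.

```python
import numpy as np

# Hypothetical helper: keep all normal rows (y == 0) and at most
# max_anomaly_fraction * len(y) anomalous rows (y == 1).
def limit_anomalies(x, y, max_anomaly_fraction=0.0, seed=0):
    rng = np.random.default_rng(seed)
    normal_idx = np.flatnonzero(y == 0)
    anom_idx = np.flatnonzero(y == 1)
    n_keep = int(max_anomaly_fraction * len(y))
    kept_anom = rng.choice(anom_idx, size=min(n_keep, len(anom_idx)),
                           replace=False)
    idx = np.sort(np.concatenate([normal_idx, kept_anom]))
    return x[idx], y[idx]

x = np.arange(10).reshape(10, 1)
y = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
x_tr, y_tr = limit_anomalies(x, y, max_anomaly_fraction=0.1)
print(int(y_tr.sum()))  # 1 anomalous row kept (10% of 10 samples)
```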
A collection of handwritten digits (MNIST). http://yann.lecun.com/exdb/mnist/
A dataset from a competition task to build a network intrusion detector (KDD Cup 1999). https://archive.ics.uci.edu/ml/datasets/kdd+cup+1999+data
A collection of datasets from the NAB benchmark for anomaly detection. https://github.com/numenta/NAB
$ pip install .[test]
$ coverage run -m pytest .
$ pip install .[lint]
$ flake8