Generative Adversarial Network for Detecting Irregularities. Applying GANs to anomaly detection.

GANDI: Generative Adversarial Networks for Detecting Irregularities

Harnessing the usually discarded GAN discriminator for the task of anomaly detection.


[For a more comprehensive explanation, see the PDF file in the paper folder]

Anomaly detection is hard: the set of possible anomaly types is open-ended and cannot be enumerated in advance, so there is no representative set of labeled anomalies on which to train a discriminative model. To train an anomaly detector we therefore use a generative model, which only needs examples of normal data.

We hypothesize that the discriminator's performance over time (during training) is parabolic: it begins knowing nothing (randomly initialized) and ends up confused (once the generator has won the competition). In between these two ends, however, we know it learns some meaningful representation for telling true data from fake data; otherwise it could not contribute to improving its foe, the generator, which we know does improve (we can test that).
At some point in training, the discriminator outputs values near 1 when it encounters true data and near 0 for data impersonating true data (adversarially generated synthetic data). If we can find the sweet spot where the discriminator is at its best (i.e. it separates true from fake by a clearly identifiable margin), we can pause training and export (or rather metamorphose) the discriminator as an anomaly detector: feeding it real data yields a near-constant output (~1), while feeding it anomalies (data that was not "real" during training) yields some other value.
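One way to make the "sweet spot" concrete is to record, at each test point during training, the discriminator's outputs on real and on generated samples, and keep the iteration with the largest gap between the two means. The sketch below assumes such a history has already been collected; `pick_sweet_spot` and the numbers are hypothetical illustrations, not part of this repository.

```python
from statistics import mean

def pick_sweet_spot(history):
    """Return (iteration, margin) with the largest real-vs-fake margin.

    `history` maps a training iteration to a pair
    (scores_on_real, scores_on_fake) of discriminator outputs in [0, 1].
    """
    best_iter, best_margin = None, float("-inf")
    for iteration, (real_scores, fake_scores) in sorted(history.items()):
        margin = mean(real_scores) - mean(fake_scores)
        if margin > best_margin:
            best_iter, best_margin = iteration, margin
    return best_iter, best_margin

# Made-up outputs illustrating the hypothesized rise and fall.
history = {
    0:    ([0.50, 0.52, 0.49], [0.51, 0.48, 0.50]),  # untrained: no separation
    1000: ([0.90, 0.95, 0.88], [0.10, 0.15, 0.05]),  # discriminator at its best
    5000: ([0.52, 0.50, 0.51], [0.49, 0.50, 0.48]),  # generator has won
}
best_iter, best_margin = pick_sweet_spot(history)
```

The mean margin is only one possible criterion; the AUC against a held-out anomaly set could be tracked in the same way.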

After the discriminator metamorphoses into an anomaly detector, we can test it as if it were a binary classifier: see how it performs when encountering two sets of data, true data (labeled 1, as during training) and anomaly data (labeled 0). We can then apply several metrics to the resulting confusion matrix, mainly the area under the ROC curve (AUC).
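As a rough illustration of this evaluation, the AUC can be computed directly from the discriminator's scores: it equals the probability that a randomly chosen true sample scores higher than a randomly chosen anomaly (ties count as one half). The sketch below uses only the standard library; `auc_from_scores` and the score values are hypothetical, and the repository's own MetricsD module may compute this differently.

```python
def auc_from_scores(real_scores, anomaly_scores):
    """AUC as P(score(real) > score(anomaly)), ties counted as one half."""
    wins = 0.0
    for r in real_scores:
        for a in anomaly_scores:
            if r > a:
                wins += 1.0
            elif r == a:
                wins += 0.5
    return wins / (len(real_scores) * len(anomaly_scores))

# Made-up discriminator outputs: true data is labeled 1, anomalies 0.
real_scores = [0.93, 0.88, 0.97, 0.61]
anomaly_scores = [0.12, 0.35, 0.70, 0.08]
auc = auc_from_scores(real_scores, anomaly_scores)
```

An AUC of 1.0 means every true sample outscores every anomaly, while 0.5 is what a discriminator that cannot tell them apart would give.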


The generator converging to the true data (PDF and CDF):
[Figures: pdf, cdf]

The discriminator's performance as an anomaly detector during training:
[Figure: ROC curve of GAN anomaly detector for growing anomalies]

More similar results here.


Description of Files:

  • PlayGround - Main file running the project.
  • RunParams - Static file with setting declarations (the NN architecture, training and losses, training parameters, what tests to perform and how often, etc.)
  • Distributions - Classes describing the different distributions used in the demonstration, e.g. a wrapper around SciPy's Gaussian distribution.
  • DiscriminatorNN - Discriminator class and the different discriminator neural architectures.
  • GeneratorNN - Generator class and the different generator neural architectures.
  • NNbuilds - Helper for defining linear layers and optimizers for TensorFlow networks. Shared by both the discriminator and the generator.
  • Tracker - An object tracking the progress of the training process. It is called on every training iteration and decides whether a test should be conducted and registered, and whether the current model's parameters should be logged.
  • MetricsG - Module of different goodness-of-fit tests to be conducted on the generator during the training process.
  • MetricsD - Object for testing the discriminator (as an anomaly detector) during training.
  • GAN - Class for training the GAN model.
  • Plots - Plotting class generating different plots based on the resulting statistics of the generator, discriminator (anomaly detector), and the neural network training progress.
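To illustrate the Tracker's role described in the list above, here is a minimal sketch of an object that is called on every iteration and decides whether to run a test or log parameters. The class name, the fixed-interval rule, and the callbacks are all assumptions made for illustration; the actual Tracker may decide differently.

```python
class TrackerSketch:
    """Hypothetical stand-in for Tracker: tests and logs at fixed intervals."""

    def __init__(self, test_every, log_every):
        self.test_every = test_every
        self.log_every = log_every
        self.results = []  # registered (iteration, measurement) pairs

    def step(self, iteration, run_test, log_params):
        # Called once per training iteration.
        if iteration % self.test_every == 0:
            self.results.append((iteration, run_test()))
        if iteration % self.log_every == 0:
            log_params(iteration)

logged = []
tracker = TrackerSketch(test_every=100, log_every=250)
for iteration in range(1, 1001):
    # The lambda stands in for a real metric (e.g. discriminator AUC).
    tracker.step(iteration, run_test=lambda: 0.5, log_params=logged.append)
```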


Workspace Structure

Running creates many files, each with a unique name (signature) for every model-configuration run:

  • Plots (.png files)
  • TensorFlow's Saver checkpoints
  • Logs (text based)
  • Tensorboard files
  • Results (pickle file of plot objects (figures) and raw measurements (DataFrame))

The subdirectory structure created is as follows:

     |-... [All the rest of the files described above]

Each model configuration creates a unique subdirectory (or file) under this structure, named after the start time, the random states, and the configuration setting number.
For example:
GAN_2017-05-23_13-47-50_966105-765644_9 is an experiment testing setting no. 9 (see train_params), started on May 23rd 2017 at 13:47:50 with NumPy's random state 966105 and TensorFlow's random state 765644.
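The naming scheme in this example can be decoded mechanically. The sketch below assumes every signature follows the exact layout of the example above (prefix, date, time, NumPy seed and TensorFlow seed joined by a hyphen, setting number); `parse_signature` is a hypothetical helper, not a function from the repository.

```python
from datetime import datetime

def parse_signature(signature):
    """Split a run signature such as GAN_2017-05-23_13-47-50_966105-765644_9."""
    prefix, date, time, seeds, setting = signature.split("_")
    np_seed, tf_seed = seeds.split("-")
    return {
        "model": prefix,
        "started": datetime.strptime(date + "_" + time, "%Y-%m-%d_%H-%M-%S"),
        "numpy_seed": int(np_seed),
        "tensorflow_seed": int(tf_seed),
        "setting": int(setting),
    }

info = parse_signature("GAN_2017-05-23_13-47-50_966105-765644_9")
```

This makes it easy to group output files by setting number or to re-run an experiment with the same random states.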


  1. There's a shebang header in PlayGround, so after
    $ cd SOME_PATH/gandi
    you can simply run it in the terminal.
  2. In order to change settings and configuration you should edit the variables in
  3. To extend the model, just add another neural net architecture in GeneratorNN and/or DiscriminatorNN:
    def architecture_n(net_input, ...)
    and update the factory dictionary _arch_types:
    _arch_types = {"architecture_1": architecture_1,
                   "architecture_n": architecture_n}
    and then add it to the train_params dictionary:
    {"d_arch_num": n, "g_arch_num": m}
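The registration steps in item 3 can be sketched end to end as follows. The architecture bodies here are placeholders that return a description string instead of building TensorFlow layers, and the rule mapping an architecture number to the key "architecture_<number>" is an assumption for illustration; `build_network` is a hypothetical helper.

```python
def architecture_1(net_input):
    # Placeholder: a real architecture would build and return network layers.
    return "arch 1 on {}".format(net_input)

def architecture_2(net_input):
    # The newly added architecture (the "architecture_n" of the steps above).
    return "arch 2 on {}".format(net_input)

# Factory dictionary mapping names to architecture-building functions.
_arch_types = {"architecture_1": architecture_1,
               "architecture_2": architecture_2}

def build_network(arch_num, net_input):
    """Look up an architecture by number (assumed key format) and build it."""
    key = "architecture_{}".format(arch_num)
    if key not in _arch_types:
        raise KeyError("unknown architecture: " + key)
    return _arch_types[key](net_input)

train_params = {"d_arch_num": 2, "g_arch_num": 1}
discriminator = build_network(train_params["d_arch_num"], "x")
generator = build_network(train_params["g_arch_num"], "z")
```

Keeping the lookup in a dictionary means a new architecture only requires a new function and one new dictionary entry, with no changes to the training code.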



Please contact me before citing, using, or extending this work.
(Only to resolve credits for this not-formally-published work 😊)


Feel free to contact me regarding this project!

