GraphEnas

Graph convolution based neural architecture search with weight sharing. This work was inspired by the Efficient Neural Architecture Search via Parameter Sharing (ENAS) paper, and draws on graph convolutions as a way to compute graph-modifying actions.

Unfortunately, this does not work in the real world (or at least, I have yet to succeed in making it work): the training phase seems to be unreasonably slow and unstable, and the fact that the graph convolution sees only a single data point per training batch of the network is probably partly to blame.

Theory

This algorithm uses a "supergraph" DAG in which every node is a layer of the neural network. The operation at each node is encoded as a one-hot vector over the possible layers (3x3 & 5x5 convolutions, depthwise 5x5 & 7x7 convolutions, dilated 3x3 & 5x5 convolutions, and 3x3, 5x5 & 7x7 max pools), with a "1" at the currently active layer and "0" elsewhere. The connections between the layers are encoded in an adjacency matrix (strictly triangular under a topological ordering of the DAG) containing "1"s where layers are connected, as sketched below.
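For illustration, here is a minimal sketch of that encoding (variable names and the exact candidate-layer list are assumptions for illustration, not the repo's actual code):

```python
import torch

# Candidate operations for each node; this list mirrors the one above.
OPS = ["conv3x3", "conv5x5", "depthwise5x5", "depthwise7x7",
       "dilated3x3", "dilated5x5", "maxpool3x3", "maxpool5x5", "maxpool7x7"]

num_nodes = 6  # number of layers in the supergraph

# One-hot operation encoding: row i has a single "1" at node i's active layer.
op_encoding = torch.zeros(num_nodes, len(OPS))
active = torch.randint(len(OPS), (num_nodes,))
op_encoding[torch.arange(num_nodes), active] = 1.0

# DAG connectivity: adjacency[i, j] = 1 means layer i feeds layer j.
# Under a topological ordering this matrix is strictly upper-triangular.
adjacency = torch.triu(torch.randint(0, 2, (num_nodes, num_nodes)), diagonal=1)
```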

At every stage, an actor chooses a graph-modifying action, and a critic tries to predict the performance of the network on a part of the test set after the given change. The neural net is then trained for a fixed amount of time (a time budget is used because otherwise the network can degenerate into enabling all possible skip connections, which is stronger but far too slow). The critic loss is the delta between the actual network performance and the predicted network performance, and the actor loss is based on its performance as predicted by the critic.
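As a rough sketch of that loss structure (one common actor-critic formulation; the repo's exact losses may differ):

```python
import torch

def critic_loss(predicted_perf: torch.Tensor, actual_perf: torch.Tensor) -> torch.Tensor:
    # The critic is penalized by the gap between its predicted performance
    # and the performance actually measured on the held-out test data.
    return (predicted_perf - actual_perf).pow(2).mean()

def actor_loss(action_log_prob: torch.Tensor, predicted_perf: torch.Tensor) -> torch.Tensor:
    # The actor is scored by the critic: raise the probability of actions
    # the critic predicts will perform well (the critic is detached so this
    # gradient only updates the actor).
    return -(action_log_prob * predicted_perf.detach()).mean()
```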

At every training batch, the shared weights are updated as in the original "Efficient Neural Architecture Search via Parameter Sharing" paper.
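Weight sharing can be pictured like this (a hypothetical sketch, not the repo's code): each candidate operation at each node owns one persistent module, and every sampled architecture routes through the same underlying parameters:

```python
import torch.nn as nn

NUM_NODES, CHANNELS = 6, 16

# One module per (node, candidate op). These weights persist across all
# sampled architectures, so every architecture trains the same parameters.
shared_ops = nn.ModuleList([
    nn.ModuleDict({
        "conv3x3": nn.Conv2d(CHANNELS, CHANNELS, 3, padding=1),
        "conv5x5": nn.Conv2d(CHANNELS, CHANNELS, 5, padding=2),
    })
    for _ in range(NUM_NODES)
])

def node_forward(node_idx, active_op, x):
    # A sampled architecture only selects which shared module runs.
    return shared_ops[node_idx][active_op](x)
```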

The activation function after all convolutions is a "fuzzy ReLU": a modified softplus that matches the value of a ReLU at 0 and has a derivative of 1, to prevent gradients from exploding or vanishing.
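The README does not spell out the exact formula; one construction consistent with the description is a shifted softplus (an assumption for illustration, not necessarily the repo's definition):

```python
import math
import torch
import torch.nn.functional as F

def fuzzy_relu(x: torch.Tensor) -> torch.Tensor:
    # softplus(x) - ln(2) equals the ReLU's value (0) at x = 0, is smooth
    # everywhere, and its slope tends to 1 for large x like a ReLU,
    # keeping gradients from exploding or vanishing.
    return F.softplus(x) - math.log(2.0)
```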

Model

The supergraph.py file contains the model-generating network. The model.py file contains the implementation of a single model built from the supergraph's output. The graphsage.py file contains the graph convolution neural net.

Running

Install PyTorch and scikit-learn, then run py -3 -m test (the Windows launcher; use python3 -m test elsewhere). Runs much faster on GPU-equipped systems.
