Neural Architecture Search without Training
Introduction
Motivation & Contributions
The general idea of the paper is to capture the correlation of activations within a network when it is subject to different inputs in a mini-batch of data: the lower the correlation, the better the network is expected to perform, since it can distinguish between different inputs well.
The paper uses this overlap of activations between datapoints in untrained networks as a proxy for the network's eventual performance.
The paper evaluates the algorithm on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces.
Score Method
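For a mini-batch of inputs, each datapoint induces a binary code recording which rectified linear units in the network are active. Inputs with similar codes are hard for the network to tell apart, so the paper builds a kernel matrix K_H whose entry for datapoints i and j counts the activation decisions they share, K_H[i,j] = N_A - d_H(c_i, c_j) (with N_A rectified linear units and d_H the Hamming distance between codes c_i and c_j), and scores the untrained network with s = log |K_H|, the log-determinant of that kernel. Below is a minimal PyTorch sketch of this computation; it assumes the model exposes its rectified linear units as nn.ReLU modules, and naswot_score is a name chosen here for illustration.

```python
import numpy as np
import torch
import torch.nn as nn

def naswot_score(model: nn.Module, inputs: torch.Tensor) -> float:
    """Score an untrained network from the overlap of its ReLU
    activation patterns over one mini-batch (higher is better)."""
    codes = []

    def hook(_module, _inputs, output):
        # Record which ReLU units fire for each datapoint (binary code).
        codes.append((output.detach() > 0).flatten(1))

    handles = [m.register_forward_hook(hook)
               for m in model.modules() if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        model(inputs)
    for h in handles:
        h.remove()

    c = torch.cat(codes, dim=1).float()  # (batch, N_A) binary codes
    # K_H[i, j] = N_A - Hamming distance = number of units where i and j agree.
    k_h = c @ c.t() + (1 - c) @ (1 - c).t()
    _, logdet = np.linalg.slogdet(k_h.cpu().numpy())
    return float(logdet)
```

With a single mini-batch (e.g. 128 images), naswot_score(model, images) returns the score with no training and no labels required.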
Evaluation
We are able to search for a network that achieves 92.81% accuracy in 30 seconds within the NAS-Bench-201 search space.
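The search behind that result is simple: sample candidate architectures at random, score each untrained network on one mini-batch, and keep the highest-scoring candidate, which is the only network that is then trained. Here is a sketch under the same assumptions as above, where sample_architecture is a hypothetical stand-in for drawing a random model from a search space such as NAS-Bench-201:

```python
def naswot_search(sample_architecture, inputs, n_samples=100):
    """Sample-and-score search: return the architecture with the
    highest training-free score among n_samples random draws."""
    best_arch, best_score = None, float("-inf")
    for _ in range(n_samples):
        arch = sample_architecture()        # hypothetical: random draw from the space
        score = naswot_score(arch, inputs)  # score defined in the sketch above
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch
```

Because each score needs only a single forward pass, scoring a sample of candidates takes seconds rather than the GPU-days of training-based NAS.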
Correlation of Accuracy & Score
These results point to our score being effective on a wide array of neural network design spaces.
Study
- How important are the images used to compute the score?
We randomly select 10 architectures from different CIFAR-100 accuracy percentiles in NAS-Bench-201 and compute the score separately for 20 random CIFAR-100 mini-batches. The scores vary little across mini-batches, which suggests our score captures a property of the network architecture, rather than something data-specific.
- Does the initialization influence the score?
Better-performing networks remain distinctive and can be isolated regardless of the initialization.
- Does the size of the mini-batch matter?
The best-performing networks remain distinct.
- How does the score evolve as networks are trained?
We observe that the score increases in all cases immediately after some training has occurred.