Neural Architecture Search without Training

Posted on June 3, 2022 · 1 minute read

Introduction

Motivation & Contributions

The general idea of the paper is to capture the correlation of activations within a network when it is subject to different inputs from a mini-batch of data: the lower the correlation, the better the network is expected to perform, since it can differentiate well between different inputs.

The paper uses the overlap of activations between datapoints in untrained networks to predict the network's performance without any training.

The paper evaluates the algorithm on NAS-Bench-101, NAS-Bench-201, NATS-Bench, and Network Design Spaces.

Score method

For a mini-batch of N inputs, each input i induces a binary code c_i over the network's N_A rectified linear units (1 where the unit is active). The kernel matrix K_H has entries (K_H)_ij = N_A − d_H(c_i, c_j), where d_H is the Hamming distance between codes, and the score of the network is s = log |K_H|: the more distinct the codes across inputs, the closer K_H is to diagonal and the larger the score.
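A minimal NumPy sketch of that computation follows; the function name and the random-code demo are illustrative stand-ins, not the paper's released code.

```python
import numpy as np

def naswot_score(codes: np.ndarray) -> float:
    """Score a network from the binary ReLU activation codes of a mini-batch.

    codes: (N, N_A) 0/1 matrix; row i is the activation pattern of input i
           over the network's N_A rectified linear units.
    """
    codes = codes.astype(np.float64)
    # Agreements between codes: (K_H)_ij = N_A - d_H(c_i, c_j), i.e. the
    # number of units where inputs i and j activate identically.
    K_H = codes @ codes.T + (1.0 - codes) @ (1.0 - codes).T
    # The score is log|K_H|; slogdet is numerically safer than log(det).
    sign, logdet = np.linalg.slogdet(K_H)
    return logdet

# Illustrative demo with random codes (16 inputs, 1000 ReLU units).
rng = np.random.default_rng(0)
codes = rng.integers(0, 2, size=(16, 1000))
print(naswot_score(codes))
```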

Evaluation

We are able to search for a network that achieves 92.81% accuracy in 30 seconds within the NAS-Bench-201 search space.
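The search itself is simple: sample a pool of candidate architectures, score each on a single mini-batch at initialization, and return the highest-scoring one. A hedged sketch of this loop is below; `sample_arch` and `score_arch` are hypothetical interfaces standing in for the benchmark API.

```python
import numpy as np

def naswot_search(sample_arch, score_arch, n_candidates=100):
    """Training-free NAS: pick the candidate with the highest score.

    sample_arch: () -> architecture     (e.g. a random NAS-Bench-201 cell)
    score_arch:  architecture -> float  (the log-determinant score above)
    Both callables are assumed interfaces, not the benchmark's real API.
    """
    best_arch, best_score = None, -np.inf
    for _ in range(n_candidates):
        arch = sample_arch()
        s = score_arch(arch)  # one forward pass on one mini-batch
        if s > best_score:
            best_arch, best_score = arch, s
    return best_arch
```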

Correlation of Accuracy & Score

[Figures from the paper: scatter plots of final validation accuracy against the untrained-network score for architectures in each benchmark, showing a positive correlation.]

These results point to our score being effective on a wide array of neural network design spaces.
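One common way to quantify this kind of relationship in the NAS literature is a rank correlation between scores and trained accuracies. A small sketch with made-up numbers:

```python
from scipy.stats import kendalltau

# scores[i] is the untrained-network score of architecture i;
# accs[i] is its final test accuracy after training (hypothetical data).
scores = [1520.3, 1498.7, 1555.1, 1462.0]
accs   = [93.2,   91.8,   94.0,   89.5]

tau, p_value = kendalltau(scores, accs)
print(f"Kendall tau = {tau:.2f} (p = {p_value:.3f})")
```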

Study

  1. How important are the images used to compute the score?

In the paper, we randomly select 10 architectures from different CIFAR-100 accuracy percentiles in NAS-Bench-201 and compute the score separately for 20 random CIFAR-100 mini-batches; the scores are largely consistent across batches.

This suggests our score captures a property of the network architecture, rather than something data-specific.

  2. Does the initialization influence the score?

Better-performing networks remain distinctive and can be isolated regardless of the random initialization.

  3. Does the size of the mini-batch matter?

The best-performing networks remain distinct across mini-batch sizes.

  4. How does the score evolve as networks are trained?

We observe that the score increases in all cases immediately after some training has occurred.




