Learning Transferable Architectures for Scalable Image

Posted on April 28, 2022 ∼ 2 minute read ∼ Filed in: A paper note

Introduction

Problems

Previous work, such as RL-based NAS, has a huge search space and is hard to apply to large datasets, because training each child architecture on a large dataset takes a long time.

Contributions

The main contribution of this work is the design of a novel search space, such that the best architecture found on the CIFAR-10 dataset would scale to larger, higher resolution image datasets like ImageNet across a range of computational settings.

Key Features of the Search Space

**The overall architecture of the convolutional nets is manually predetermined**, and is built from two kinds of cells.

Each cell has 5 blocks, and each block has a fixed architecture with:
- Two inputs
- Two operators
- One combination method

But there are several options for each of the above:
- 2 options for each input.
- 9 options for each operator.
- 2 options for each combination method.
Each cell receives two initial hidden states as input, h_i and h_{i-1}, which are the outputs of the two cells in the previous two lower layers, or the input image.
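To make the size of this search space concrete, here is a minimal sketch that samples one random cell and counts the choices per block. The option counts (2, 9, 2) and the 5 blocks come from the note above; the option names such as `op_0` or `add` are placeholders assumed for illustration, not the paper's actual operation list.

```python
import random

# Counts come from the note; the concrete names are hypothetical placeholders.
NUM_BLOCKS = 5
INPUT_OPTIONS = ["h_i", "h_i-1"]
OPERATOR_OPTIONS = [f"op_{k}" for k in range(9)]
COMBINE_OPTIONS = ["add", "concat"]  # assumed names for the 2 combination methods


def sample_block(rng):
    """Pick one option for each of the 5 decisions that define a block."""
    return {
        "input_1": rng.choice(INPUT_OPTIONS),
        "input_2": rng.choice(INPUT_OPTIONS),
        "op_1": rng.choice(OPERATOR_OPTIONS),
        "op_2": rng.choice(OPERATOR_OPTIONS),
        "combine": rng.choice(COMBINE_OPTIONS),
    }


def sample_cell(rng, num_blocks=NUM_BLOCKS):
    """A cell is a list of independently sampled blocks."""
    return [sample_block(rng) for _ in range(num_blocks)]


def choices_per_block():
    """Number of distinct blocks: inputs^2 * operators^2 * combinations."""
    return (len(INPUT_OPTIONS) ** 2
            * len(OPERATOR_OPTIONS) ** 2
            * len(COMBINE_OPTIONS))


rng = random.Random(0)
cell = sample_cell(rng)
print(len(cell), "blocks,", choices_per_block(), "possible configurations per block")
```

Under these counts a single block already has 2 * 2 * 9 * 9 * 2 = 648 configurations, which is why uniform random sampling is only a baseline and a learned controller is used instead.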

Search Target

All five choices in each block: the two inputs, the two operators, and the combination method.

Search Method

Overall

Controller Sampling

The controller is a one-layer LSTM with 100 hidden units that predicts a child architecture. The prediction process is as follows:

Predictions are made one after another. The selection made for the first hidden state is the input when predicting the second hidden state, and the selection made for the second hidden state is the input when predicting the operation applied to the first hidden state.
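The chained prediction can be sketched as an autoregressive sampling loop. This is only an illustration: `toy_logits` is a hypothetical stand-in for the LSTM (it looks only at the step index and the previous choice), and the per-block decision order in `STEP_OPTIONS` is an assumption based on the description above.

```python
import math
import random

# Options for the 5 decisions in one block, in assumed prediction order:
# input 1, input 2, operator 1, operator 2, combination method.
STEP_OPTIONS = [2, 2, 9, 9, 2]


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(step, prev_choice, n_options):
    """Hypothetical stand-in for the LSTM: scores depend on the previous choice."""
    return [math.sin(step + prev_choice + k) for k in range(n_options)]


def sample_architecture(num_blocks, rng):
    """Sample every decision for both kinds of cell, feeding each choice forward."""
    choices, probs = [], []
    prev, step = 0, 0
    for _cell in range(2):                  # two kinds of cell
        for _block in range(num_blocks):
            for n in STEP_OPTIONS:
                p = softmax(toy_logits(step, prev, n))
                choice = rng.choices(range(n), weights=p)[0]
                choices.append(choice)
                probs.append(p[choice])     # probability of the sampled option
                prev = choice               # becomes the next step's input
                step += 1
    return choices, probs


choices, probs = sample_architecture(num_blocks=5, rng=random.Random(0))
print(len(choices), "decisions sampled")
```

A real controller would carry an LSTM hidden state across steps rather than just the last choice; the sketch keeps only the part the note describes, namely that each selection is the input to the next prediction.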

Controller Training

After each prediction, the controller records the probability of the option it chose, so there are 10B recorded probabilities in total. The controller uses the joint probability (the product of all these probabilities) to compute gradients with a reinforcement learning method such as PPO.

5 (search targets per block) × B (blocks per cell) × 2 (kinds of cell) = 10B
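As a rough illustration of how the recorded probabilities feed the update, the sketch below computes the joint log-probability and a plain REINFORCE-style loss. Note the substitution: this is vanilla policy gradient, not PPO, chosen only because it fits in a few lines. The function names, the uniform 0.5 probabilities, and the reward and baseline values are all made up for the example.

```python
import math


def joint_log_prob(step_probs):
    """Log of the product of all per-step probabilities (a sum of logs)."""
    return sum(math.log(p) for p in step_probs)


def reinforce_loss(step_probs, reward, baseline=0.0):
    """REINFORCE surrogate loss: minimizing it raises the probability of
    architectures whose reward beats the baseline. Not PPO."""
    return -(reward - baseline) * joint_log_prob(step_probs)


B = 5                                  # blocks per cell
num_predictions = 5 * B * 2            # 5 targets * B blocks * 2 kinds of cell
step_probs = [0.5] * num_predictions   # made-up per-step probabilities

log_p = joint_log_prob(step_probs)
loss = reinforce_loss(step_probs, reward=0.9, baseline=0.8)
print(num_predictions, log_p, loss)
```

Working in log space avoids underflow: with 10B factors below 1, the raw product shrinks quickly, while the sum of logs stays well scaled.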
