HNAS
1 minute read ∼ Filed in: A paper note

Introduction
Motivation & Contribution
The paper tries to answer two questions:
- What is the relationship between the four training-free NAS algorithms?
- How can the transferability of training-free NAS algorithms be proved theoretically?
Candidate algs
Connections
The paper uses a series of inequalities to prove that all the other metrics can be expressed in terms of M_trace.
The paper examines the Spearman correlation between each metric and M_trace on a batch of randomly sampled data from CIFAR-10.
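To make the comparison concrete, here is a minimal sketch of computing Spearman's rank correlation between two metrics scored on the same candidate architectures. The metric values below are hypothetical, purely for illustration; the paper evaluates real training-free metrics on sampled data.

```python
# Spearman rank correlation between two training-free metrics,
# evaluated on the same set of candidate architectures.
# The metric values at the bottom are hypothetical, for illustration.

def ranks(xs):
    """Return 1-based ranks of each value (average rank for ties)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # group tied values into one block
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank for the tied block
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Hypothetical M_trace and another metric on 5 sampled architectures.
m_trace = [1.2, 3.4, 2.2, 5.1, 4.0]
m_other = [0.9, 2.8, 2.0, 4.9, 4.1]
print(spearman(m_trace, m_other))  # near +1 => strong rank agreement
```

A rho close to +1 means the two metrics rank architectures in nearly the same order, which is how the paper argues the metrics agree with M_trace.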
Transferability
Algorithm generalization performance over the search space
First, the paper asks: why do existing training-free NAS algorithms with these metrics achieve compelling performance in practice? Its approach is to relate the loss on the test dataset to the metrics, showing that the test loss is upper bounded in terms of the metrics.
According to the bound, in the non-realizable scenario:
- A larger M implies better generalization performance.
- When M > mc/n, there is a trade-off in M between convergence and transferability: a larger M means slower convergence but a smaller generalization gap.
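The two bullets above can be summarized schematically. This is not the paper's exact theorem, only a shape consistent with the stated behavior (n is the sample size, and m, c are the quantities in the threshold M > mc/n):

```latex
% Schematic only -- not the paper's exact statement. It encodes the
% bullets above: a convergence term that grows with M and a
% generalization-gap term that shrinks with M, with the trade-off
% becoming active once M exceeds mc/n.
\mathcal{L}_{\text{test}}
  \;\le\;
  \underbrace{\epsilon_{\text{conv}}(M)}_{\uparrow\ \text{in } M}
  \;+\;
  \underbrace{\epsilon_{\text{gap}}(M)}_{\downarrow\ \text{in } M},
  \qquad \text{trade-off active once } M > \tfrac{mc}{n}.
```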
A larger positive value implies better agreement between the metric and the architecture's performance.
Architecture transferability over Datasets
The paper then asks whether an architecture selected with a training-free metric on one dataset S0 is also likely to achieve compelling performance on another dataset S1. The approach is to bound the loss on the test set of one dataset by the metrics computed on the other dataset.
Overall performance
Finally, the paper proposes an algorithm to find good architectures quickly: it selects the architecture that minimizes the upper bound of the test error. mu and nu are hyperparameters that can be searched with Bayesian optimization (BO), and t is set to 1 in practice.
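The selection step can be sketched as follows. Everything here is a hypothetical stand-in, not the paper's code: `compute_m_trace` abstracts the training-free metric, `bound_objective` is a bound-shaped score (one term decreasing in the metric, one increasing, weighted by mu and nu with t fixed to 1), and a small grid stands in for the BO search over (mu, nu).

```python
# Minimal sketch of bound-minimizing architecture selection.
# The objective form and helper names are hypothetical stand-ins.
import itertools

def compute_m_trace(arch):
    # Hypothetical: in practice this is the training-free metric
    # evaluated on a batch of data; here it is a stored number.
    return arch["m_trace"]

def bound_objective(m, mu, nu, t=1):
    # Hypothetical bound-shaped score: mu weights a term that
    # decreases in the metric, nu a term that increases in it.
    return mu / (m * t) + nu * m

def select(archs, mu, nu):
    """Pick the candidate minimizing the bound-shaped objective."""
    return min(archs, key=lambda a: bound_objective(compute_m_trace(a), mu, nu))

candidates = [{"name": "A", "m_trace": 1.0},
              {"name": "B", "m_trace": 4.0},
              {"name": "C", "m_trace": 16.0}]

# Stand-in for Bayesian optimization over (mu, nu): a small grid.
for mu, nu in itertools.product([0.5, 1.0, 2.0], repeat=2):
    best = select(candidates, mu, nu)
    print(f"mu={mu}, nu={nu} -> {best['name']}")
```

The point of the sketch is the structure: scoring is training-free (no candidate is trained), so the search reduces to evaluating a cheap objective and tuning its two weights.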