How Powerful are Performance Predictors in Neural Architecture Search
1 minute read ∼ Filed in : A paper note
Introduction
The paper compares 31 performance predictors across 4 search spaces and 4 datasets.
- The predictors span four families: zero-cost, model-based, learning curve extrapolation, and weight sharing.
- Configs:
- NAS-Bench-101 + CIFAR-10, DARTS + CIFAR-10
- NAS-Bench-201 + (CIFAR-10, CIFAR-100, ImageNet16-120)
- NAS-Bench-NLP + Penn TreeBank
Based on these experiments, the paper tries to
- Find which predictors perform consistently across search spaces, comparing them along three dimensions.
- Draw insights from the analysis.
- Find complementary predictors and investigate how to combine them.
The evaluation metrics are the Spearman rank correlation coefficient (SRCC), Pearson correlation, and Kendall Tau.
For model-based predictors, they train the model on 1k architectures and then test on 200 architectures.
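To make the three metrics concrete, here is a minimal pure-Python sketch. The `predicted` and `true_acc` values are made up for illustration and are not taken from the paper; the Kendall Tau variant shown is tau-a, which assumes there are no ties.

```python
import math

def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """SRCC: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

# Hypothetical predictor scores and true validation accuracies for 5 architectures.
predicted = [0.2, 0.5, 0.4, 0.9, 0.7]
true_acc = [91.0, 93.2, 93.5, 94.8, 94.1]

print(spearman(predicted, true_acc))
print(kendall_tau(predicted, true_acc))
print(pearson(predicted, true_acc))
```

In practice, `scipy.stats.spearmanr`, `scipy.stats.pearsonr`, and `scipy.stats.kendalltau` compute the same quantities and handle ties more carefully.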
- Low initialization time + low query time: Jacob and SynFlow perform well.
- Zero-cost methods do not perform well on large search spaces such as DARTS.
- High initialization time + low query time: model-based predictors perform best.
The paper combines predictors from three different families:
- SoTL-E from the learning curve methods
- Jacob from the zero-cost methods
- Both of the above are used as input features for a model-based predictor (SemiNAS and NGBoost).
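The combination described above can be sketched as feature concatenation: the architecture encoding is extended with the SoTL-E and Jacob values before being passed to a model-based predictor. The sketch below is only an illustration under stated assumptions: `build_features` and `KNNPredictor` are hypothetical names, the k-nearest-neighbour regressor is a simple stand-in for SemiNAS or NGBoost, and all numbers are made up.

```python
import math

def build_features(arch_encoding, sotl_e, jacob_score):
    """Concatenate the architecture encoding with the two predictor outputs."""
    return list(arch_encoding) + [sotl_e, jacob_score]

class KNNPredictor:
    """Tiny k-nearest-neighbour regressor, a stand-in for a model-based predictor."""

    def __init__(self, k=2):
        self.k = k
        self.X, self.y = [], []

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        # Sort training points by Euclidean distance to x,
        # then average the targets of the k nearest ones.
        nearest = sorted(
            zip(self.X, self.y), key=lambda pair: math.dist(pair[0], x)
        )[: self.k]
        return sum(target for _, target in nearest) / len(nearest)

# (encoding, SoTL-E, Jacob score, validation accuracy); all values are made up.
train = [
    ([1, 0, 0, 1], 0.80, 0.30, 90.1),
    ([0, 1, 0, 1], 0.55, 0.60, 92.7),
    ([0, 1, 1, 0], 0.40, 0.75, 93.9),
    ([1, 1, 1, 0], 0.35, 0.80, 94.2),
]
X = [build_features(enc, sotl, jac) for enc, sotl, jac, _ in train]
y = [row[3] for row in train]

model = KNNPredictor(k=2).fit(X, y)
query = build_features([0, 1, 1, 0], 0.42, 0.70)
estimate = model.predict(query)
print(estimate)
```

With a distance-based model like this one, the added features would normally need to be scaled to a range comparable to the encoding; the toy values here are already of similar magnitude.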
After this, the paper measures how well these predictors speed up NAS. It mainly uses two methods:
- Predictor-guided evolution framework.
- Bayesian optimization.
This suggests that using zero-cost methods in conjunction with model-based methods is a promising direction for future study.
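The predictor-guided evolution framework mentioned above can be sketched as a loop that generates many candidates by mutation, ranks them with a cheap predictor, and fully evaluates only the top few. Everything in this sketch is an assumption for illustration, not the paper's implementation: the search space is a toy space of 8-bit tuples, `true_accuracy` stands in for expensive training, and `predictor` is a deliberately imperfect stand-in for a performance predictor.

```python
import random

def true_accuracy(arch):
    """Stand-in for expensive full training: the number of 1-bits."""
    return sum(arch)

def predictor(arch):
    """Stand-in for a cheap predictor: it only sees the first 6 of 8 bits."""
    return sum(arch[:6])

def mutate(arch, rng):
    """Flip one randomly chosen bit."""
    i = rng.randrange(len(arch))
    return arch[:i] + (1 - arch[i],) + arch[i + 1:]

def predictor_guided_evolution(rng, pop_size=5, n_candidates=20, top_k=2, iterations=10):
    # Initial population: random architectures, each fully evaluated.
    population = [tuple(rng.randint(0, 1) for _ in range(8)) for _ in range(pop_size)]
    history = {arch: true_accuracy(arch) for arch in population}
    initial_best = max(history.values())
    for _ in range(iterations):
        # Parents are the best architectures evaluated so far.
        parents = sorted(history, key=history.get, reverse=True)[:pop_size]
        # Generate many cheap candidates, skipping ones already evaluated.
        candidates = {mutate(rng.choice(parents), rng) for _ in range(n_candidates)}
        candidates = {c for c in candidates if c not in history}
        # Rank candidates with the predictor; fully evaluate only the top_k.
        for arch in sorted(candidates, key=predictor, reverse=True)[:top_k]:
            history[arch] = true_accuracy(arch)
    best_arch = max(history, key=history.get)
    return best_arch, initial_best, history

best_arch, initial_best, history = predictor_guided_evolution(random.Random(0))
print(best_arch, history[best_arch], "evaluated:", len(history))
```

The point of the design is the evaluation budget: each iteration considers up to `n_candidates` architectures but pays the full training cost for at most `top_k` of them.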