How Powerful are Performance Predictors in Neural Architecture Search
Introduction
The paper compares 31 performance predictors across 4 search spaces and 4 datasets.
- The predictors span four families: zero-cost, model-based, learning-curve extrapolation, and weight-sharing methods.
- Configs:
- NAS-Bench-101 + CIFAR-10, DARTS + CIFAR-10
- NAS-Bench-201 + (CIFAR-10, CIFAR-100, ImageNet16-120)
- NAS-Bench-NLP + Penn TreeBank
After those experiments, the paper tries to
- Find which predictors perform consistently across search spaces, comparing along three dimensions
- Analyze the results for insights
- Find complementary predictors and investigate how to combine them
Experiments
Effectiveness
They measure Spearman rank correlation (SRCC), Pearson correlation, and Kendall Tau between predicted and true performance.
For model-based predictors, they train the predictor on 1k architectures and then test on 200 held-out architectures.
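A minimal sketch of these rank-correlation metrics, assuming hypothetical arrays of predicted scores and true validation accuracies for the 200 test architectures (not the paper's code):

```python
import numpy as np
from scipy import stats

# Hypothetical data: predictor scores vs. true validation accuracies
# for 200 held-out test architectures.
rng = np.random.default_rng(0)
true_acc = rng.uniform(0.85, 0.95, size=200)
pred_score = true_acc + rng.normal(0.0, 0.01, size=200)  # a noisy predictor

srcc, _ = stats.spearmanr(pred_score, true_acc)    # Spearman rank correlation
pearson, _ = stats.pearsonr(pred_score, true_acc)  # Pearson correlation
ktau, _ = stats.kendalltau(pred_score, true_acc)   # Kendall Tau

print(f"SRCC={srcc:.3f}, Pearson={pearson:.3f}, Kendall Tau={ktau:.3f}")
```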
Conclusion
- With low query time and low initialization time budgets, Jacobian covariance (jacov, sketched below) and SynFlow perform well.
- Zero-cost predictors do not perform well on large search spaces like DARTS.
- With high initialization time and low query time budgets, model-based performance predictors are best.
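To illustrate why zero-cost proxies need essentially no initialization time, here is a rough sketch of a Jacobian-covariance style score computed from a single minibatch (my simplification, not the paper's implementation; `net` and `x` are placeholders for any model and input batch):

```python
import torch

def jacov_score(net, x, eps=1e-5):
    """Jacobian-covariance style zero-cost proxy (simplified sketch).

    Scores how decorrelated the input-output Jacobians are across the
    samples of one minibatch; no training is required.
    """
    x = x.clone().requires_grad_(True)
    y = net(x)
    y.backward(torch.ones_like(y))           # gradient of summed outputs w.r.t. inputs
    jac = x.grad.reshape(x.size(0), -1)      # one flattened Jacobian row per sample
    corr = torch.corrcoef(jac)               # correlation between samples
    eigvals = torch.linalg.eigvalsh(corr)
    # Highly correlated Jacobians give near-zero eigenvalues and a low score.
    return -torch.sum(torch.log(eigvals + eps) + 1.0 / (eigvals + eps)).item()
```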
Efficiency
The paper combines three methods from three different families.
- SoTL-E from the learning-curve extrapolation family
- Jacobian covariance (jacov) from the zero-cost family
- Both of the above are used as input features for a model-based predictor (SemiNAS and NGBoost), as sketched below.
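A minimal sketch of that combination, assuming a hypothetical fixed-length architecture encoding and precomputed jacov / SoTL-E values (names and shapes here are illustrative, not the paper's pipeline), with NGBoost's `NGBRegressor` as the model-based predictor:

```python
import numpy as np
from ngboost import NGBRegressor

# Hypothetical data: one row per architecture.
# arch_enc: fixed-length encoding of each architecture (e.g. adjacency/ops).
# jacov, sotl_e: cheap zero-cost and learning-curve scores for the same archs.
n_train = 1000
arch_enc = np.random.rand(n_train, 32)
jacov = np.random.rand(n_train, 1)
sotl_e = np.random.rand(n_train, 1)
val_acc = np.random.rand(n_train)  # true validation accuracies (placeholder)

# Augment the architecture encoding with the cheap extra signals.
X = np.hstack([arch_enc, jacov, sotl_e])
predictor = NGBRegressor().fit(X, val_acc)

# At query time, the same features are computed for unseen architectures.
pred = predictor.predict(X[:5])
```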
The paper then measures how these combined metrics can speed up NAS itself, mainly using two frameworks (the first is sketched at the end of this note):
- Predictor-guided evolution framework.
- Bayesian optimization.
This suggests that using zero-cost methods in conjunction with model-based methods is a promising direction for future study.
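A rough outline of the predictor-guided evolution idea (a generic paraphrase, not the paper's exact algorithm): mutate the best architectures found so far, rank the mutated candidates with the cheap predictor, and spend full training only on the top-ranked candidate. All functions passed in are placeholders.

```python
def predictor_guided_evolution(init_pop, mutate, predictor, train_and_eval,
                               n_iters=30, n_candidates=40, k=5):
    """Sketch: use a cheap predictor to filter mutations before expensive training."""
    # Fully train and evaluate the initial population.
    history = [(arch, train_and_eval(arch)) for arch in init_pop]
    for _ in range(n_iters):
        # Pick the k best architectures found so far as parents.
        parents = [a for a, _ in sorted(history, key=lambda t: -t[1])[:k]]
        candidates = [mutate(p) for p in parents for _ in range(n_candidates // k)]
        # Rank candidates with the cheap predictor; only train the best one.
        best = max(candidates, key=predictor)
        history.append((best, train_and_eval(best)))
    return max(history, key=lambda t: t[1])
```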